Thread: Equation of regression curve for line diagram




Equation of regression curve for line diagram
user name
2007-07-28 10:10:36
Hello,

because there is no obvious solution, I'll bring this up for
discussion here:

Example table
x	y
-2	-4,84
-1	-1,74
0	0
1	2,38
2	3,82
3	7,17
4	9,28

gives, for a linear regression, the following equations (as shown in
the status bar):
y = 2.286 ∙ x − 6.847 for a line diagram
y = 2.286 ∙ x + 0.01 for an XY diagram

The equation for the line diagram is wrong because line diagrams do
not use the real x-values; they use the values 1, 2, … instead.
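
Both status-bar equations can be reproduced with an ordinary
least-squares fit. The following Python sketch is only an
illustration and is not taken from the chart code; the helper name
linear_fit is hypothetical, and the y-values are the table's numbers
written with decimal points instead of decimal commas. It fits the
data once against the real x-values and once against the positions
1, 2, …, 7.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


x_values = [-2, -1, 0, 1, 2, 3, 4]
y_values = [-4.84, -1.74, 0.0, 2.38, 3.82, 7.17, 9.28]

# XY diagram: regression against the real x-values.
xy_slope, xy_intercept = linear_fit(x_values, y_values)

# Line diagram: regression against the category positions 1, 2, ..., n.
positions = list(range(1, len(y_values) + 1))
line_slope, line_intercept = linear_fit(positions, y_values)

print(f"XY diagram:   slope {xy_slope:.3f}, intercept {xy_intercept:.3f}")
print(f"line diagram: slope {line_slope:.3f}, intercept {line_intercept:.3f}")
```

The slope is the same in both fits because the positions are just the
x-values shifted by 3. Only the intercept moves, by 3 times the slope,
which accounts for the difference between 0.01 and −6.847.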

The question now is whether the regression curve should keep doing
this, in which case it would be enough to tell the user about it, or
whether the calculation should use the x-values of the data series
whenever they have a data type that can be calculated with.

kind regards
Regina

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribegraphics.openoffice.org
For additional commands, e-mail: dev-helpgraphics.openoffice.org


Re: Equation of regression curve for line diagram
user name
2007-07-30 03:17:12
Hi Regina,

The problem here is the fundamental difference between line charts
and scatter charts. A line chart uses categories (which are strings)
and has equidistant data points. The "x-values" in this example are
category names shared by all data series of a line chart. If you want
them to be x-values, you have to choose scatter as the chart type.

The data series does not know any x-values in a line chart, so
neither does the regression curve. In addition, your example works
only because the x-values happen to be equidistant. If they weren't,
a line chart would still show the data points equidistant. A
regression curve using the non-equidistant x-values would simply be
wrong (the graphs would not fit).
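
A small Python sketch of this point, using made-up data (the numbers
and the helper name linear_fit are assumptions for illustration
only): the x-values 1, 2, 4, 8, 16 are not equidistant, and y = 2x + 1
holds exactly. A line chart would still draw the five points at
positions 1 to 5, so a regression line computed from the real
x-values does not pass through the points as plotted.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: non-equidistant x-values, y = 2x + 1 exactly.
x_values = [1, 2, 4, 8, 16]
y_values = [2 * x + 1 for x in x_values]

# Regression computed from the real x-values.
slope, intercept = linear_fit(x_values, y_values)

# A line chart places the points at the equidistant positions 1..n.
positions = list(range(1, len(y_values) + 1))

# Height of the regression line at each position where a point is drawn.
line_at_positions = [slope * p + intercept for p in positions]

# Vertical gap between each plotted point and that line.
gaps = [y - v for y, v in zip(y_values, line_at_positions)]
print([round(g, 6) for g in gaps])
```

The gaps grow from the third point onward, which is the sense in
which the graphs "would not fit" on a category axis.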

The only way out of this dilemma that I see is to guide the user
somehow toward a scatter chart when the "categories" are numbers. In
Excel there is an automatic behavior that uses scatter charts even if
you select a line chart type, when the categories are all numbers.
However, I would prefer to make this clear to the user.

Regards,
Bjoern



Re: Equation of regression curve for line diagram
user name
2007-07-30 09:28:11
Hi Bjoern,

Bjoern Milcke wrote:
> The problem here is the fundamental difference between line charts
> and scatter charts. A line chart uses categories (which are strings)
> and has equidistant data points. The "x-values" in this example are
> category names shared by all data series of a line chart. If you
> want them to be x-values, you have to choose scatter as the chart
> type.

I know that, but I often see users simply click the "line" type to
get a chart with lines, with all the problems that brings.

> 
> The data series does not know any x-values in a line chart

Why not? The data range is known, so it should be possible to check
whether it contains numbers.

> , so neither does the regression curve. In addition, your example
> works only because the x-values happen to be equidistant. If they
> weren't, a line chart would still show the data points equidistant.
> A regression curve using the non-equidistant x-values would simply
> be wrong (the graphs would not fit).

The equations are wrong in most cases. The question is where the user
should be told that the displayed equation does not fit the numbers
in their data series.

> 
> The only way out of this dilemma that I see is to guide the user
> somehow toward a scatter chart when the "categories" are numbers.

I can suggest some text for the online help (that would be issue
77929, help file /text/schart/01/04050100.xhp); that's no problem.
But will a user with little mathematical knowledge notice that
something is wrong and then look into the help? A warning when
creating the chart would be more helpful.

> In Excel there is an automatic behavior that uses scatter charts
> even if you select a line chart type, when the categories are all
> numbers.

In my Excel 97 and Excel 2007 there is no such automatic behavior.
Excel creates the line chart without warning and shows the "wrong"
equation in the chart.

> However, I would prefer to make this clear to the user.

Shall I write an issue for displaying a warning when creating the
chart, or do you think it is enough to add an explanation to the
help?

kind regards
Regina



Re: Equation of regression curve for line diagram
user name
2007-07-30 14:44:58
Hi,

I would explicitly warn the user.

The chart drawing engine should test whether all x-values are
numbers. If they are, a warning should be displayed, because the user
most probably wanted these explicit x-values. [A possible exception
would be x-values that store dates. Although a date is a number (Calc
basically stores dates as numbers), dates are not supposed to be used
in the calculation of the regression line. But dates would be
difficult to detect unless a new data type "Date" were introduced,
and NOT merely a 'display-like-date' flag.]
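
The proposed test could look roughly like the Python sketch below.
This is only an illustration under stated assumptions, not chart
engine code: the names is_number and should_warn are hypothetical,
and accepting a decimal comma in category strings is an added
assumption based on the example table. Python has a separate date
type, so excluding dates is easy here; the sketch does not address
the harder case described above, where a date is just a number with
a display format.

```python
from datetime import date


def is_number(value):
    """True for numeric values and for strings that parse as a number."""
    # Dates (and booleans) are deliberately not treated as numbers.
    if isinstance(value, (bool, date)):
        return False
    if isinstance(value, (int, float)):
        return True
    if isinstance(value, str):
        try:
            # Assumption: accept a decimal comma as well as a decimal point.
            float(value.strip().replace(",", "."))
            return True
        except ValueError:
            return False
    return False


def should_warn(categories):
    """Warn only if there are categories and every one of them is a number."""
    return len(categories) > 0 and all(is_number(c) for c in categories)


print(should_warn(["-2", "-1", "0", "1", "2", "3", "4"]))
print(should_warn(["Jan", "Feb", "Mar"]))
print(should_warn([date(2007, 7, 28), date(2007, 7, 30)]))
```

A chart wizard could call a check of this kind when a line chart is
requested and then suggest a scatter chart instead of silently
switching the chart type.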

Sincerely,

Leonard


Regina Henschel wrote:
> Shall I write an issue for displaying a warning when creating the
> chart, or do you think it is enough to add an explanation to the
> help?

