Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Analytics
- /
- Stat Procs
- /
- Re: Proc mixed, defining data structure for desired comparison (Random...

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

☑ This topic is **solved**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 11-15-2023 12:01 PM
(2482 views)

Hello,

Exp details: 24 pens, 3 feed types (800, 400, 100), 8 replicates, RCBD

The purpose is to evaluate a preference for feed type (800, 400, 100).

To make the comparisons, three "treatments" were created (A = 800 vs 100, B= 800 vs 400, C =400 vs 100), the A,B, C treatments were used to allocate 2 of the 3 feeds to 1 pen in a RCBD.

For 5 days, a known amount of feed was added to each feeder/pen, and after 24 h the feed was weighed back to determine the feed disappearance. Preference was defined as ((disappearance of ___ feed (800, 400, 100; respectively))/ Total feed disappearance (combined total of the 2 feeders/pen)*100).

Giving the proportion of of each feed /pen/day.

In the proposal (was proved to me...), the Statistical Analysis is suppose to be "...preference test as the fixed effect and replicate as the random effect. LSmean and means are to be separated using pdiff with Tukey's adjustment with pen as the EU." IMHO this somewhat lacking and not very descriptive.

CODE:

input

PEN | DIET | Treatment | Rep | Feed | pref1 | pref2 | pref3 | pref4 | pref5 |

1 | 1 | 3 | 1 | 800 | 68.7500 | 57.6471 | 67.8571 | 57.9545 | 78.9157 |

1 | 3 | 3 | 1 | 100 | 31.2500 | 42.3529 | 32.1429 | 42.0455 | 21.0843 |

2 | 1 | 1 | 1 | 800 | 79.7101 | 44.7368 | 44.4444 | 41.3043 | 48.3333 |

2 | 2 | 1 | 1 | 400 | 20.2899 | 55.2632 | 55.5556 | 58.6957 | 51.6667 |

3 | 2 | 2 | 1 | 400 | 100.0000 | 67.5676 | 85.1852 | 80.5825 | 56.0847 |

3 | 3 | 2 | 1 | 100 | 0.0000 | 32.4324 | 14.8148 | 19.4175 | 43.9153 |

4 | 1 | 1 | 2 | 800 | 92.5926 | 57.7465 | 65.5172 | 73.9130 | 59.0278 |

4 | 2 | 1 | 2 | 400 | 7.4074 | 42.2535 | 34.4828 | 26.0870 | 40.9722 |

....

;

Run;

Proc Print;

Run;

%macro model_loop;

%let yvar1 = pref1;

%let yvar2 = pref2;

%let yvar3 = pref3;

%let yvar4 = pref4;

%let yvar5 = pref5;

%do i=1 %to 5 ;

Proc Mixed data=Exp200 NOBOUND ASYCOV method=REML;

Title "&&yvar&i";

class Pen Trt Rep Feed;

model &&yvar&i = Feed/ddfm=Kenwardroger;

random Rep / subject=Pen;

lsmeans Feed/pdiff adjust=tukey;

store out=work.glm&&yvar&i;

run;

proc plm restore=work.glm&&yvar&i;

Title "&&yvar&i PLM";

lsmeans Feed/ pdiff lines adjust=tukey;

run;

%end;

%mend model_loop;

%model_loop;

quit;

The issue I'm having is a failure to converge . I have a feeling that it has to do with the specification of the subject.

I guess I don't have a super specific question. I just don't feel like this is the correct way to analyze the data and looking for advice.

Regards,

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Rather than a ratio of each component to the total you might consider a ratio of consumption of the feed with the larger feed level to the smaller feed level for each pen. You get a single value that is indexed by the variable 'Treatment'. Or consider the difference in consumption calculated in much the same way. That might even be better as it is much more likely to collectively have residuals that are distributed normally. The lsmeans by treatment are then a measure of how much more was consumed of the higher valued feed - which is a clear measure of preference.

SteveDenham

9 REPLIES 9

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

It seems to me that for every Pen, you have two observations. The values for these two observations for each of Pref1-Pref5 always add up to 100. That might be a problem. You might want to reconsider how you want to arrange your data and set up your model.

Thanks,

Jill

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi Jill,

Thank you for the response. I also think that is the issue.

Since these values are percentages that add to 100% (when combined), they aren't independent.

Any idea or suggestion on how I can re-arrange?

Regards,

Kyle

Thank you for the response. I also think that is the issue.

Since these values are percentages that add to 100% (when combined), they aren't independent.

Any idea or suggestion on how I can re-arrange?

Regards,

Kyle

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

One approach might be to just use one observation for each Pen.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you again for the response. How might only using 1 or the observations impact the interpretation or the results?

Regards,

Regards,

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I hope I understand this correctly. Each pen is randomly assigned to one of 3 treatments (A, B, C). Thus, a pen will have one value. I don't see the advantage of calculating a value for each as a proportion. Have you considered fitting the raw consumption values (which should be independent), and then calculating the ratios from the least squares means?

The other point to consider is that ratios probably do not yield normally distributed residuals, as the cut-offs at 0 and 1 may be a problem. If you are going to analyze the proportions as the dependent variable, a beta distribution for the values may be more appropriate. That would mean shifting to GLIMMIX or GEE.

And last, failure to converge. That very likely can be fixed by adding MAXITER= and MAXFUNC= options to the PROC MIXED statement, provided there are no other issues. Check both the .log and .lst file for notes, warnings and notifications that indicate that the algorithm has difficulties other than slow convergence.

SteveDenham

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hello @stevendenham,

Thank you for your response!

Yes, you are correct. There are 3 pens/block (i used the word "Rep" in my initial post) with 8 blocks in total.

The reason I have two values for each pen is due to the different consumptions from the two feeders in each pen. If I were to use only 1 value from the pen, I am unsure how I would differentiate the consumption between the two feeders within the pen (800 v 100 etc.).

To your point on proportions, I also considered this and agree especially with a handful of cases were the consumption for the feeders within a pen are 0 and 1, respectively. Admittedly, I am a bit unfamiliar with GLIMMIX and GEE, but should the code look something like this:

Proc GLIMMIX data=Exp200 ;

class Pen Trt Block Feed;

model y= Feed / ddfm=KR;

random block/ subject=Pen; **Unsure if " /subject=pen" is necessary**

lsmeans Feed/pdiff lines adjust=tukey;

run;

As for the failure to converge, I was able to get it to converge (with no errors) by removing the NOBOUND option, but the Cov Parm Estimate for Block/subject=pen was zero which is odd.

Regards,

Kyle

Thank you for your response!

Yes, you are correct. There are 3 pens/block (i used the word "Rep" in my initial post) with 8 blocks in total.

The reason I have two values for each pen is due to the different consumptions from the two feeders in each pen. If I were to use only 1 value from the pen, I am unsure how I would differentiate the consumption between the two feeders within the pen (800 v 100 etc.).

To your point on proportions, I also considered this and agree especially with a handful of cases were the consumption for the feeders within a pen are 0 and 1, respectively. Admittedly, I am a bit unfamiliar with GLIMMIX and GEE, but should the code look something like this:

Proc GLIMMIX data=Exp200 ;

class Pen Trt Block Feed;

model y= Feed / ddfm=KR;

random block/ subject=Pen; **Unsure if " /subject=pen" is necessary**

lsmeans Feed/pdiff lines adjust=tukey;

run;

As for the failure to converge, I was able to get it to converge (with no errors) by removing the NOBOUND option, but the Cov Parm Estimate for Block/subject=pen was zero which is odd.

Regards,

Kyle

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

~~hi Kyle @Laser_Taco_ ,~~

~~Just a few things that occurred to me over the weekend. In GLIMMIX, for ratios you should probably specify a distribution, and change the method to LAPLACE with tech=nrridg. So perhaps something like this:~~

`Proc GLIMMIX data=Long_version method=laplace;`

nloptions maxiter=1000 tech=nrridg; class Pen Trt diet Block Feed instance; model y=Trt Feed instance Feed*instance/dist=beta ddfm=KR; random instance/subject=pen*diet type=ar(1); random feed/subject=pen type=chol; random block; lsmeans Feed/pdiff lines adjust=tukey; run;

~~The first thing to do is to re-arrange the data into a long format, where there is one dependent value in each observation (here it is y). The other variables in each line would be pen, trt, diet (although this looks completely confounded with feed), block, feed and instance, which is indexed from 1 to 5. So instead of:~~

~~You would have (for the first pen, and all other pens would follow this pattern:~~

~~ And so on. Note that all of the y values have been rescaled to the (0, 1) interval. So now there are two levels of feed for each pen and five levels of instance for each pen by diet combination.~~

~~If I were better informed about how to use it, I might recommend PROC COPULA for the analysis, but that is only based on what the examples look like.~~

I thought I had a way to address this, but so long as each pref within a treatment is such that the values for the two diets within a treatment sum to 1, all of the above will have some major issues due to lack of independence. If the preference response was converted to a ratio or some other type of value, so that a single value is obtained per pen at each preference measurement (time), you should be able to improve the analysis.

SteveDenham

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi @SteveDenham ,

I'm not sure if I understand what you mean by "converted to a ratio or some other type of value, so that a single value is obtained per pen at each preference measurement (time), you should be able to improve the analysis.".

I understand that having multiple responses from the same pen is my issue here. For converting the values to a ratio, each value per diet/pen is already a ratio == Diet__ :Total Consumed.

Regards,

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Rather than a ratio of each component to the total you might consider a ratio of consumption of the feed with the larger feed level to the smaller feed level for each pen. You get a single value that is indexed by the variable 'Treatment'. Or consider the difference in consumption calculated in much the same way. That might even be better as it is much more likely to collectively have residuals that are distributed normally. The lsmeans by treatment are then a measure of how much more was consumed of the higher valued feed - which is a clear measure of preference.

SteveDenham

**SAS Innovate 2025** is scheduled for May 6-9 in Orlando, FL. Sign up to be **first to learn** about the agenda and registration!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.