Given the recent traffic on the nmusers list about modeling
log-transformed data, I thought it would make a good topic
for this week's tip.
I will leave the relative merits of modeling transformed data
to previous posts on the list and to the modeler's discretion.
In its simplest form, modeling log-transformed data consists
of adding a column with the logarithm of the dependent variable
to your original data set and using an appropriate logarithmic
expression of the error model.
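The data step can be sketched outside NONMEM. The following is a
minimal Python sketch, not a prescribed workflow: the helper name
add_log_dv and the sample records are hypothetical, and the column
names XDV (original observation) and DV (log-transformed
observation) are assumed to match the $INPUT record used later in
this post. Records without a positive observation get "." in DV.

```python
import csv
import io
import math


def add_log_dv(rows, source="XDV", target="DV"):
    """Return copies of rows with target = natural log of source.

    Records whose source value is missing, non-numeric, or not
    positive (e.g. dose records) get "." as the target value.
    """
    out = []
    for row in rows:
        new = dict(row)
        raw = row.get(source) or ""
        try:
            value = float(raw)
        except ValueError:
            value = None
        if value is not None and value > 0:
            new[target] = format(math.log(value), ".6g")
        else:
            new[target] = "."
        out.append(new)
    return out


# Hypothetical two-record data set: one dose, one observation.
sample = io.StringIO("ID,TIME,AMT,XDV,MDV\n1,0,25,.,1\n1,2,.,17.3,0\n")
rows = add_log_dv(list(csv.DictReader(sample)))
for row in rows:
    print(row)
```

Keeping the original observation in its own column (XDV here)
makes it easy to carry the non-transformed value into the output
table alongside the log-scale predictions.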
To approximate, on the log scale, the proportional residual
error structure commonly applied to non-transformed data, one
would use an additive expression like the following for the
residual error:
Y=LOG(F)+ERR(1)
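Why an additive error on the log scale stands in for a
proportional error on the original scale can be seen from a short
derivation. The symbols C (the non-transformed observation) and
\varepsilon (playing the role of ERR(1)) are notation assumed here
for illustration; F is the model prediction as above.

```latex
\begin{align*}
% Proportional error on the original scale
C &= F\,(1+\varepsilon)
  &&\Longrightarrow&
  \log C &= \log F + \log(1+\varepsilon) \approx \log F + \varepsilon
  \quad (|\varepsilon| \ll 1) \\
% Exponential error gives the same form exactly
C &= F\,e^{\varepsilon}
  &&\Longrightarrow&
  \log C &= \log F + \varepsilon
\end{align*}
```

The first line is a first-order approximation that holds when the
residual error is small; the second shows that the log-scale model
is exact for an exponential error on the original scale.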
There has been much discussion (primarily by Chuanpu Hu and
Leonid Gibiansky) about how to approximate, on the log scale,
the combined additive and proportional residual error model often
found useful with non-transformed data. Lewis Sheiner suggested a model
published by Stuart Beal (JPP 28(5), 2001, 481-504). This
model has been referred to (elsewhere) as "the double exponential
error model" and can be coded as follows:
Y=LOG((F)+THETA(3))+(F*ERR(1))/(F+THETA(3))+(THETA(3)*ERR(2))/(F+THETA(3))
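To get a feel for how this expression behaves, the log-scale
standard deviation implied by the code above can be evaluated
numerically. This is a sketch derived only from the expression as
coded, assuming ERR(1) and ERR(2) are uncorrelated; it is not taken
from the paper. The function name log_scale_sd and the numeric
values are illustrative (m stands for THETA(3)).

```python
import math


def log_scale_sd(f, m, sigma1_sq, sigma2_sq):
    """SD of Y = LOG(F+m) + F*e1/(F+m) + m*e2/(F+m).

    Assumes e1 and e2 are independent with variances sigma1_sq
    and sigma2_sq, so the variance is the weighted sum below.
    """
    w1 = f / (f + m)  # weight on ERR(1), approaches 1 as F grows
    w2 = m / (f + m)  # weight on ERR(2), approaches 1 as F -> 0
    return math.sqrt(w1 ** 2 * sigma1_sq + w2 ** 2 * sigma2_sq)


# Illustrative values only: m = 0.1 and both variances = 0.1.
for f in (0.001, 0.1, 1.0, 100.0):
    print(f, round(log_scale_sd(f, m=0.1, sigma1_sq=0.1, sigma2_sq=0.1), 4))
```

Under these assumptions, ERR(1) dominates when F is large relative
to THETA(3), and ERR(2) dominates as F approaches zero, where the
prediction itself levels off at LOG(THETA(3)).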
I would caution any modeler to first read this paper and be
aware of the characteristics of this model before implementing
it. For your amusement and amazement, I include here a full
control stream implementing log-transformed data and the
"double exponential error model":
;Model Desc: base model using log-transformed data and Beal error model
;Project Name: example1
;Project ID: GM00-001
$PROB RUN# 105 (PHENOBARBITAL POPULATION PK MODEL)
$INPUT C ID TIME AMT WT APGR XDV EVID MDV DV
$DATA LN002.CSV IGNORE=C
$SUBROUTINES ADVAN1
$PK
TVCL=THETA(1)
CL=TVCL*EXP(ETA(1))
TVV=THETA(2)
V=TVV*EXP(ETA(2))
K=CL/V
S1=V
TAD=TIME
OBS=XDV ;non-transformed observations
$ERROR
PRD=F ;individual non-transformed prediction (with POSTHOC or FOCE),
;otherwise the population non-transformed prediction
;Beal, JPP 28(5), 2001, p.488
Y=LOG((F)+THETA(3))+(F*ERR(1))/(F+THETA(3))+(THETA(3)*ERR(2))/(F+THETA(3))
IF(F.GT.0) THEN
IPRED=LOG(F) ;the log-transformed individual prediction
ELSE
IPRED=0
ENDIF
$THETA
(0,1) ;CL
(0,5) ;V
(0,0.1) ;"m, a positively constrained variance parameter"
$OMEGA
0.16 ;[P] inter-individual variability in CL
0.16 ;[P] inter-individual variability in V
$SIGMA
0.1 ;e1, random error term
0.1 ;e2, random error term
$EST MAXEVAL=9999 PRINT=20 NOABORT METHOD=1 INTERACTION MSF=105.MSF
$COV
$TABLE ID TIME TAD IPRED OBS PRD NOPRINT ONEHEADER FILE=105.TAB
[errors may be reported to the author and are welcomed.]
[Generic Disclaimer: Verify that this code or any other code
you receive from an outside source works with YOUR DATA.]