강의 홍보
1줄 요약
Sample Dataset
- Sample Data를 가져오는 코드를 작성합니다.
- 이 때
PyDataset 라이브러리를 활용합니다.
Collecting pydataset
[?25l Downloading https://files.pythonhosted.org/packages/4f/15/548792a1bb9caf6a3affd61c64d306b08c63c8a5a49e2c2d931b67ec2108/pydataset-0.2.0.tar.gz (15.9MB)
[K |████████████████████████████████| 15.9MB 285kB/s
[?25hRequirement already satisfied: pandas in /usr/local/lib/python3.7/dist-packages (from pydataset) (1.1.5)
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.7/dist-packages (from pandas->pydataset) (2.8.1)
Requirement already satisfied: numpy>=1.15.4 in /usr/local/lib/python3.7/dist-packages (from pandas->pydataset) (1.19.5)
Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.7/dist-packages (from pandas->pydataset) (2018.9)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.7.3->pandas->pydataset) (1.15.0)
Building wheels for collected packages: pydataset
Building wheel for pydataset (setup.py) ... [?25l[?25hdone
Created wheel for pydataset: filename=pydataset-0.2.0-cp37-none-any.whl size=15939431 sha256=ebe470895a3467fe13c7654021e9108227a6dec8ce6da4f9b4e704520bcd6203
Stored in directory: /root/.cache/pip/wheels/fe/3f/dc/5d02ccc767317191b12d042dd920fcf3432fab74bc7978598b
Successfully built pydataset
Installing collected packages: pydataset
Successfully installed pydataset-0.2.0
from pydataset import data
print(data())
dataset_id title
0 AirPassengers Monthly Airline Passenger Numbers 1949-1960
1 BJsales Sales Data with Leading Indicator
2 BOD Biochemical Oxygen Demand
3 Formaldehyde Determination of Formaldehyde
4 HairEyeColor Hair and Eye Color of Statistics Students
.. ... ...
752 VerbAgg Verbal Aggression item responses
753 cake Breakage Angle of Chocolate Cakes
754 cbpp Contagious bovine pleuropneumonia
755 grouseticks Data on red grouse ticks from Elston et al. 2001
756 sleepstudy Reaction times in a sleep deprivation study
cake = data("cake")
print(cake)
data("cake", show_doc=True)
replicate recipe temperature angle temp
1 1 A 175 42 175
2 1 A 185 46 185
3 1 A 195 47 195
4 1 A 205 39 205
5 1 A 215 53 215
.. ... ... ... ... ...
266 15 C 185 28 185
267 15 C 195 25 195
268 15 C 205 25 205
269 15 C 215 31 215
270 15 C 225 25 225
cake
PyDataset Documentation (adopted from R Documentation. The displayed examples are in R)
## Breakage Angle of Chocolate Cakes
### Description
Data on the breakage angle of chocolate cakes made with three different
recipes and baked at six different temperatures. This is a split-plot design
with the recipes being whole-units and the different temperatures being
applied to sub-units (within replicates). The experimental notes suggest that
the replicate numbering represents temporal ordering.
### Format
A data frame with 270 observations on the following 5 variables.
`replicate`
a factor with levels `1` to `15`
`recipe`
a factor with levels `A`, `B` and `C`
`temperature`
an ordered factor with levels `175` < `185` < `195` < `205` < `215` < `225`
`angle`
a numeric vector giving the angle at which the cake broke.
`temp`
numeric value of the baking temperature (degrees F).
### Details
The `replicate` factor is nested within the `recipe` factor, and `temperature`
is nested within `replicate`.
### Source
Original data were presented in Cook (1938), and reported in Cochran and Cox
(1957, p. 300). Also cited in Lee, Nelder and Pawitan (2006).
### References
Cook, F. E. (1938) _Chocolate cake, I. Optimum baking temperature_. Master's
Thesis, Iowa State College.
Cochran, W. G., and Cox, G. M. (1957) _Experimental designs_, 2nd Ed. New
York, John Wiley \& Sons.
Lee, Y., Nelder, J. A., and Pawitan, Y. (2006) _Generalized linear models with
random effects. Unified analysis via H-likelihood_. Boca Raton, Chapman and
Hall/CRC.
### Examples
str(cake)
## 'temp' is continuous, 'temperature' an ordered factor with 6 levels
(fm1 <- lmer(angle ~ recipe * temperature + (1|recipe:replicate), cake, REML= FALSE))
(fm2 <- lmer(angle ~ recipe + temperature + (1|recipe:replicate), cake, REML= FALSE))
(fm3 <- lmer(angle ~ recipe + temp + (1|recipe:replicate), cake, REML= FALSE))
## and now "choose" :
anova(fm3, fm2, fm1)