CorNinaber.nl

Check generated data

I had a problem with my data generation. The idea is to create a multinomial outcome and some predictors. Here P is the matrix of probabilities that a case is assigned to each class, and F is the dummy-coded matrix indicating the class that was actually assigned.
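To make the roles of P and F concrete, here is a minimal sketch of one way such data could be generated. It is not the actual gen.dat function: the name gen.sketch, the number of classes K, the number of predictors n.pred, and the softmax link are all assumptions made for illustration.

```r
## Hypothetical stand-in for the data generation; NOT the original gen.dat.
gen.sketch <- function(N, K = 3, n.pred = 2) {
  ## predictors: N x n.pred matrix of standard normal draws
  X <- matrix(rnorm(N * n.pred), nrow = N, ncol = n.pred)
  ## arbitrary coefficients, one column per class (assumption)
  B <- matrix(rnorm(n.pred * K), nrow = n.pred, ncol = K)
  ## linear predictor, then softmax so every row of P sums to 1
  eta <- X %*% B
  P <- exp(eta) / rowSums(exp(eta))
  ## draw one class per case, using that case's own row of P
  cls <- apply(P, 1, function(p) sample.int(K, size = 1, prob = p))
  ## dummy code the assigned class: exactly one 1 per row
  F <- diag(K)[cls, , drop = FALSE]
  list(X = X, P = P, F = F)
}

set.seed(1)
d <- gen.sketch(N = 50)
```

With data built this way, d$P[d$F == 1] picks out, for every case, the probability of the class it was actually assigned to, which is the quantity examined in the test below.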

It worked when I tested it earlier, but it seems that when the number of generated cases is increased, the probability a case has for its assigned class deteriorates. Below is the small script I used to test this, together with the resulting picture. (As a side note, I use the gen.dat function to generate the data.)

## create a list object for the output
test.N=list()
## an exponential sequence of different number of cases.
N.seq=unique(round(exp(seq(log(5),log(10000),length.out=50))))
## Run over the sequence
for(n in N.seq){
  ## generate the data
  temp.dat <- gen.dat(N=n, to.file=FALSE)
  ## extract the relevant part
  ## (P is the class probability, F is the class indicator matrix)
  test.N[[which(n==N.seq)]] <- temp.dat$P[which(temp.dat$F==1)]
}
## make a boxplot of the result
boxplot(test.N, names=N.seq, xlab="N", ylab="P(F==1)")
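
As a numeric complement to the boxplot, the selected probabilities can also be summarised per sample size. This is only a sketch: summarise.sel is a hypothetical helper, not part of the original script, and the toy list below merely shows the call.

```r
## Hypothetical helper: median and quartiles of the selected
## probabilities, one row per sample size.
summarise.sel <- function(p.list, n.seq) {
  data.frame(
    N      = n.seq,
    median = sapply(p.list, median),
    q25    = sapply(p.list, quantile, probs = 0.25, names = FALSE),
    q75    = sapply(p.list, quantile, probs = 0.75, names = FALSE)
  )
}

## toy input only, to show the call; with the script above one
## would use summarise.sel(test.N, N.seq) instead
toy <- list(c(0.2, 0.4, 0.6), c(0.1, 0.5, 0.9, 0.7))
res <- summarise.sel(toy, c(3, 4))
```

If the median column drops steadily as N grows, that would confirm what the boxplot suggests.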


© 2021 CorNinaber.nl
