

URL: http://www.agnld.uni-potsdam.de/~shw/Lehre/lehrangebot/2009SS/R.html

Summer semester 2009

R scripts for modelling and data analysis of complex systems

When alumni of universities in Europe and the US who had moved into nonacademic jobs in society and industry were asked what they actually need in their work, it turned out that most of them do stochastic things like time series analysis, data processing etc., which had never appeared in detail in their university courses.
Chorin & Hald: Stochastic Tools in Mathematics and Science; Honerkamp: Stochastische dynamische Systeme; Mahnke et al.: Physics of Stochastic Processes, p. XV

VVVVVVVVVVVVVVVVV 1.VL Motivation and introduction VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
#### Exercise 1 ########### Karsten Ahnert April 24 #################################################################

R --no-save < commands.R

?c, ?read.table  help on c() or read.table()
help.search("c")  searches the help system for c
find("c")  returns the package in which c is located
quit(), example("c")
library(mgcv)  loads the library mgcv
library(help=mgcv)  lists information about mgcv: which objects, functions etc.
install.packages("akima")  installs the package akima

objects(), ls()  show which objects have been created
search()  shows the search path, in particular which libraries are loaded
str(x), structure(x)  show the structure of an object
summary(x)  prints a statistical summary of the data set x
rm(x)  removes x from memory
save(x,file="file.dat")  writes the data x to the file file.dat
save.image(file="file.RData")  writes the R workspace, i.e. all data etc., to file.RData; the default file is .RData
load("file.dat")  loads the data back into memory
Everything in R is an object
Every object has a length and a type (length(), mode())
e.g. a<-1:10 creates 1,2,...,10; the type is numeric and the length is 10
Numbers: 1.2e3, 3.9+4.5i, Inf, NaN
NA  not available, undefined, e.g. a <- 1 ; length(a) <- 100
Function objects, expression objects, ...
Data series can be labelled: names, row.names
Classes can be defined
Levels (factor levels)
Dimensions for matrices, arrays, or tensors in general
attributes(a)  shows all attributes
attr(a,"xyz")  shows a specific attribute
attr(a,"xyz")<-100  sets a specific attribute
Every object has a length; the i-th element can be accessed or set with a[i]
a<-numeric(250)  creates a numeric vector with 250 entries
c()  combines objects
length(x)  returns the length of an object
min(x), max(x), range(x)
sort(x)  sorts a vector
y <- 2*x  multiplies every element of x by 2
y <- sin(x)  computes the sine of every element of x
y <- x1+x2  sums the two vectors x1 and x2; if x1 and x2 have different lengths, the shorter vector is recycled cyclically
x<-1:10  creates the vector 1,2,...,10
x<-seq(1,10)  creates the vector 1,2,...,10
x<-seq(1,10,0.1)  creates the vector 1,1.1,1.2,...,9.9,10
x<-seq(1,10.01,0.1)  also creates 1,1.1,1.2,...,9.9,10; the sequence stops before exceeding 10.01
x<-seq(1,10,len=5)  creates 1,3.25,5.5,7.75,10
x<-rep( 1:2 , times=5)  creates 1,2,1,2,1,2,1,2,1,2
x<-rep( 1:2 , each=5)  creates 1,1,1,1,1,2,2,2,2,2
x<-rep( c(1,2,3,4) , c(1,2,3,4) )  creates 1,2,2,3,3,3,4,4,4,4
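The recycling rule for vectors of different lengths can be checked directly; a minimal sketch:

```r
# Elementwise arithmetic recycles the shorter vector cyclically
x1 <- 1:6
x2 <- c(10, 20)    # length 2, repeated three times against length 6
y  <- x1 + x2
print(y)           # 11 22 13 24 15 26
```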
Lists are similar to vectors, but the types of the elements may differ
l <- list( 1 , 2 , 3 )
The individual elements can also be labelled:
l <- list( eins = 1 , zwei = 2 , drei = 1:10 )  then the elements can be accessed with l$eins
Data frames are lists of class data.frame
The vector structures in a data frame must have the same length; d <- data.frame( eins=c(1:10) , zwei=c(1:10) ) creates one
Data frames are also created by read.table()
attach(bacteria)  attaches a data frame, i.e. its components can be accessed directly
detach(bacteria)  detaches the data frame bacteria, i.e. its components can no longer be accessed directly
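A small sketch of list and data frame access (the labels eins/zwei/drei are just the example names from the notes):

```r
# Lists may mix types; labelled elements are accessed with $
l <- list(eins = 1, zwei = "two", drei = 1:10)
print(l$drei)

# A data frame is a list of equal-length vectors of class data.frame
d <- data.frame(eins = 1:10, zwei = (1:10)^2)
print(d$zwei[3])   # 9
```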

Reading in data
read.table("file")  reads data from file and creates a data frame
Row labels and column labels are read if possible
Individual columns and rows have names
data.dat:
Zeit x y
1 1 1 1
2 2 2 1.5
3 3.5 2.5 1.25
inp <- read.table("data.dat")
inp <- scan("data.dat", list(0,"",0))

Define a function
my_pow <- function(x,i)
{
if( i<=0 ) 1.0
else x*my_pow(x,i-1)
}

Exercise: write functions for
y = x*x
y = sqrt(x)
y = sqrt(x1^2+x2^2)
y = atan(x1/x2)
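One possible set of solutions to this exercise (a sketch; the function names are made up, and note that the arc tangent is called atan() in R, while atan2() additionally handles x2 = 0 and the quadrants):

```r
f_square <- function(x) x * x
f_root   <- function(x) sqrt(x)
f_norm   <- function(x1, x2) sqrt(x1^2 + x2^2)   # Euclidean norm
f_angle  <- function(x1, x2) atan2(x1, x2)       # quadrant-aware atan(x1/x2)

f_square(3)    # 9
f_root(16)     # 4
f_norm(3, 4)   # 5
f_angle(1, 1)  # pi/4
```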

Loops should be avoided; they are very slow.
Sometimes, however, they are indispensable.
Example: the time series of the logistic map -> chaos, x_{n+1} = 4 x_n (1 - x_n)
n  x_n      x_{n+1}
1  0.125    0.4375
2  0.4375   0.984375
. . .
a<-numeric(250)
a[1] = 0.125
for( i in 2:250 ) a[i] = 4.0*a[i-1]*(1.0-a[i-1])

A random walk, i.e. x_{i+1} = x_i + r, where r is white noise.
Let r be normally distributed; rnorm(100) generates 100 normally distributed random numbers.
Plot the process.
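A sketch of this random walk; since the increments are independent, cumsum() replaces the explicit loop (the seed is arbitrary):

```r
set.seed(42)                      # arbitrary seed for reproducibility
r <- rnorm(100)                   # white Gaussian noise
x <- cumsum(r)                    # x_{i+1} = x_i + r_{i+1}, vectorized
plot(x, type = "l", xlab = "i", ylab = "x_i", main = "Random walk")
```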

#### Exercise 1 ### END ## Karsten Ahnert April 24 #################################################################

VVVVVVVVVVVVVVVVV 2.VL Random variables, expectation, moments, characteristic function, cumulants VVVVVVV
#### Exercise 2 ########### Karsten Ahnert May    4 #################################################################

1. R commands:

x<-read.table("dataframe")
x<-read.table("lotto.dat")
Population (Grundgesamtheit: class, population), statistical unit (Merkmalsträger: unit of observation, survey unit),
variable (Merkmal: statistical variable, item), value of the variable (Merkmalsausprägung)

tmp<-c(x$V1,x$V2,x$V3,x$V4,x$V5,x$V6)
mean(tmp)
dim(tmp)<-c(2777,6)

for(i in 1:2777){tmp[i,]=sort(tmp[i,])}

Timing test, vectorized vs. loop:
a<-numeric(20000)
b<-sqrt(a)                          # vectorized, fast
c<-numeric(20000)
for(i in 1:20000) c[i]=sqrt(a[i])   # explicit loop, much slower

logical AND over a vector
all(tmp[1,]==tmp[2,])
all(tmp)  logical AND

all(c(TRUE,FALSE))

any  logical OR over a vector
any(tmp[1,]==tmp[2,])

for(i in 1:2776)
 for(j in (i+1):2777)
        all(tmp[i,]==tmp[j,])

tmp[1,][tmp[1,]==tmp[2,]] # numbers that agree in the first and second draw, here 3 12
length(tmp[1,][tmp[1,]==tmp[2,]] == 6 ) # 2
length( tmp[1,][tmp[1,]==tmp[2,]] ) == 6 # FALSE
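The double loop over all pairs of draws is O(n^2); duplicated() applied to the row-sorted matrix finds repeated draws in one pass. A sketch with synthetic draws (the duplicate planted in row 2 is made up):

```r
set.seed(1)
draws <- t(replicate(1000, sort(sample(1:49, 6))))  # 1000 sorted 6-out-of-49 draws
draws[2, ] <- draws[1, ]                            # plant a duplicate draw
dup <- duplicated(draws)                            # TRUE for rows seen before
which(dup)                                          # indices of repeated draws
```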


2. Quantiles

q_r is the value below which a fraction r of the data lies.

Comparing two samples/variables x, y with a q-q plot compares the unknown distributions
underlying the random variables X, Y.
It allows a quick comparison of residuals.
qqplot(x,y)
x<-rnorm(1000)
y<-x*x
y<-x*x*x
y<-x+23
qqplot(x,sin(x))

hist(x)
qqplot(x,y)

hist(x,freq=FALSE)  # shows, instead of the count per bin, the estimated density of X for the sample x
points(seq(-10,10,.1),dnorm(seq(-10,10,0.1))) # overlays the density of the standard normal distribution

#### Exercise 2 ## END ### Karsten Ahnert May 4 #################################################################

VV 3.VL Distributions of functions of random variables, estimation of moments & distributions, tests, regression VVVVVVVV

#### Exercise 3 ########### Karsten Ahnert May 11 #################################################################

date -u -d @1234567890   # convert the Unix timestamp 1234567890 to a UTC date

dataclean
x<-seq(1,10)
y<-x[x<6]

tmp<-read.table("distribution.dat")
x<-c(tmp$V1)
hist(x)
z<-x[x<30]
hist(z)



 rpois(10,lambda=1)
plot(dpois(seq(0,10,1),lambda=1))
xx<-seq(0,1,0.025)
plot(xx,qcauchy(xx))



#### Exercise 3 ## END ### Karsten Ahnert May 11 ################################################################

VV 4.VL Regression, maximum-likelihood estimation, Chi^2, SVD, Levenberg-Marquardt. Press et al. chap. 15 VVVVVVVV

#### Exercise 4 ########### Karsten Ahnert May 18 #################################################################

Worked through the problems of sheet 3: consistency of the variance estimator, F distribution.

data<-rnorm(10E5)
num_of_samples = 100
sample_length=10
means<- replicate(num_of_samples, mean(sample( data,sample_length)))
vars <- ...

num_of_breaks = 10
h10<-hist( means, breaks=num_of_breaks, plot=FALSE)

plot(h10$mids, h10$density, type="o", ylim=c(0,6), xlab="mean", ylab="P(mean)")


len<-100
data<-read.table("distribution3.dat")$V1

mu=mean(data)
sigma=sd(data)/sqrt(len)
konfidenz=c(0.7,0.8,0.9,0.95)
quants=(1.0+konfidenz)/2.0
quants
diff=qnorm(quants, mean=0, sd=sigma)
diff
for (i in 1:length(diff)) {
  print( c(mu, mu-diff[i], mu+diff[i]) )
}

hist(data, freq=FALSE)
points(c(mu-diff[1], mu-diff[1]), c(0,0.2), type="o")
points(c(mu+diff[1], mu+diff[1]), c(0,0.2), type="o")
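As a cross-check on the hand-computed intervals, t.test() returns a confidence interval for the mean directly; a sketch on synthetic data (distribution3.dat itself is not needed here):

```r
set.seed(7)
data <- rnorm(100, mean = 5)
ci <- t.test(data, conf.level = 0.95)$conf.int  # 95% CI for the mean
print(ci)                                       # brackets mean(data)
```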

Fit a polynomial to data

len=100
x<-rnorm(len)
y<-x*x+3.5*x
plot(x,y)

reg<-lm(y~x)
abline(reg)
str(reg)


reg<-lm(y~I(x)+I(x*x))
plot(x,y)
lines(sort(x),reg$fitted.values[order(x)], col="red",lwd=4)

reg<-lm(y~poly(x,degree=3))

#### Exercise 4 ## END ### Karsten Ahnert May 18   ##############################################################

VV 5.VL Kalman filter a la Gerschenfeld VVVVVVVV May 25 VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV

VV 6.VL Kalman filter, splines, cluster analysis VVVV May 28 VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV

#### Exercise 5 ########### Karsten Ahnert June 8 #################################################################



/usr/cnld/shw/R/library/PlotStat1.R
##############################################################################
#
# R Intro Udo Schwarz. March 9, 2009
#
 /usr/cnld/shw/TeX/introductions/Praktikum+Exercise/2009SS-ComplexSystems/R
# demo("graphics")
#
# lpr -P kubo PlotStat1.R
#
##############################################################################
# Start R; quit with q("yes"); show existing objects or variables with objects() or ls()
# remove objects using rm()
# http://cran.r-project.org/doc/manuals/R-intro.pdf
# http://stat.ethz.ch/R-manual/R-patched/library/stats/html/
# http://research.stowers-institute.org/efg/R/Graphics/Basics/mar-oma/index.htm
# http://research.stowers-institute.org/efg/R/Graphics/Basics/mar-oma/mar-oma.R
# http://www1.maths.lth.se/help/R/plot.histogram/
# Appendix A:  A sample session starting at p. 78
##############################################################################
# run a script by source("PlotStat1.R")
# Rscript foo.R arg1 arg2
# http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_scrpt.html
#
N=1000                          # sample size
dt<-.367                        # sample time
x<-(1:N)*dt                     # time axis x<-seq(1,N,dt)
cat("Sample time = ",dt,"\n")   # output to the console
# paste(cc," = ", dt)  concatenates strings
cat("Sample frequency = ",1./dt,"\n")
cat("Nyquist frequency = ",0.5/dt,"\n")
period_length=2*pi
cat("Period length = ",period_length,"\n")
cat("Frequency  = ",1./period_length,"\n")
y<-sin(2*pi*x/period_length+2*pi*runif(1))              # harmonic function
r<-rnorm(N)                     # realization of Gaussian noise
#
# Multi-plot figures with different numbers of plots in different rows
# http://tolstoy.newcastle.edu.au/R/help/05/03/1559.html
# multi-panel plots: Trellis in Crawley p. 297 or xyplot
# coplot() and pairs() dotchart(y)
rows=3
cols=3
par(mfrow=c(rows,cols)) #set up 3 rows x 3 columns 
N0=1
Nend=100
plot(x[N0:Nend],y[N0:Nend],type="o",main="Harmonic process Y",xlab="time in secs",ylab="Y(t)")
# plot(x,y,log="y")
plot(x[N0:Nend],r[N0:Nend],type="l",main="Gaussian distributed noise R",xlab="time in secs")
yr<-y+sqrt(.08)*r
plot(x[N0:Nend],yr[N0:Nend],type="b",main="noisy harmonic process",xlab="time [secs]")
# plot an estimated naive density using stem(r) or hist(r)
hist(y)

# Parameter extracting from functions
# y<-hist(r) dd<-y$density  plot(dd)
# plot(hist(r)$density,type="b")
# plot.histogram

Hist<-hist(r,plot=FALSE)
H<-Hist$density
X<-Hist$mids
plot(X,H,type="p")
title(main = list("Histogram of Gaussian noise", cex=1.5, col="red", font=3))
# ,sub="dashed line: theory")
#points(x, cex = .5, col = "dark red")

# theoretical Gaussian 
# plot(function(x) dnorm(x, log=TRUE),-2,2,main = "log { Normal density }")
# plot(function(x) dnorm(x),-2,2,lty="dashed")
lines(X,dnorm(X,mean = 0, sd = 1),pch=3, lty=3, col = "blue")
#lines(X,dnorm(X,mean = 0, sd = 1),lty="dashed")
legend(-1,0.2, "theoretical Gaussian", lty=3, col = "blue")
# legend(-1,0.2, "theoretical Gaussian", pt.bg="white", lty=3, col = "blue")

# Shannon entropy
Entropy<- -sum(log(H)*H)
# Formatted print with the cat command
cat("Shannon entropy = ",Entropy,"\n")
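One caveat with the entropy line above: Hist$density integrates to one only together with the bin width, and empty bins give log(0). A sketch of a differential-entropy estimate that accounts for both (the tolerance against the theoretical Gaussian value is an assumption):

```r
set.seed(3)
r <- rnorm(10000)
Hist <- hist(r, plot = FALSE)
dx <- diff(Hist$breaks)[1]        # hist() uses equally spaced breaks by default
p  <- Hist$density[Hist$density > 0]
Entropy <- -sum(p * log(p)) * dx  # differential entropy estimate
Entropy                           # theory for N(0,1): 0.5*log(2*pi*e) = 1.4189...
```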

cat("covariance = ",var(r,y),"\n")   # var(r,y) is the covariance; use cor(r,y) for the correlation
cat("Kendall correlation = ",cor(r,y,method = c("kendall")),"\n")
# var(x, y = NULL, na.rm = FALSE, use)
# cov(x, y=NULL,use="all.obs",method=c("pearson","kendall", "spearman"))

hist(yr,type="s")

# Auto- and Cross- Covariance and -Correlation Function Estimation
# acf(x, lag.max = NULL, type = c("correlation","covariance","partial"), 
# plot = TRUE, na.action = na.fail, demean = TRUE, ...)


ACF<-acf(y, plot = FALSE)
A<-ACF$acf
A_noise<-A*0+1.96/sqrt(N)
TAU<-(ACF$lag)*dt
plot(TAU,A,type="o",main="ACF of a harmonic process")
lines(TAU,A_noise,lty="dashed")
lines(TAU,-A_noise,lty="dashed")
# Greek symbols xlab=substitute(paste(phi,"=",true,sigma),list(true=5))
acf(r,xlab=expression(tau),main = "auto-correlation of Gaussian noise")
acf(yr, xlab=expression(tau),ylab=expression(gamma) )

# help(Devices)
# postscript("plot1.eps", horizontal=FALSE, onefile=FALSE, height=8, width=6, pointsize=10)
# postscript() X11() windows() pdf() jpeg()
# dev.off()
# dev.list() dev.next() dev.prev() dev.cur()
#
# pdf("PlotStat1.R.pdf", onefile=FALSE, height=8, width=6, pointsize=10)
# dev.off()


/usr/cnld/shw/R/library/Plot0.R
## point covering line :
## http://stat.ethz.ch/R-manual/R-patched/library/graphics/html/legend.html
par(mfrow=c(1,1))
y <- sin(3*pi*x)
plot(x, y, type="l", col="blue",
    main = "points with bg & legend(*, pt.bg)")
points(x, y, pch=21, bg="white")
legend(.4,1, "sin(c x)", pch=21, pt.bg="white", lty=1, col = "blue")

/usr/cnld/shw/R/library/Plot1.R

## Run the example in '?matplot' or the following:
par(mfrow=c(1,1))
leg.txt <- c("Setosa     Petals", "Setosa     Sepals",
             "Versicolor Petals", "Versicolor Sepals")
y.leg <- c(4.5, 3, 2.1, 1.4, .7)
cexv  <- c(1.2, 1, 4/5, 2/3, 1/2)
matplot(c(1,8), c(0,4.5), type = "n", xlab = "Length", ylab = "Width",
        main = "Petal and Sepal Dimensions in Iris Blossoms")
for (i in seq(cexv)) {
  text  (1, y.leg[i]-.1, paste("cex=",formatC(cexv[i])), cex=.8, adj = 0)
  legend(3, y.leg[i], leg.txt, pch = "sSvV", col = c(1, 3), cex = cexv[i])
}

/usr/cnld/shw/R/library/Plot2.R

## 'merge = TRUE' for merging lines & points:
par(mfrow=c(1,1))
x <- seq(-pi, pi, len = 65)
plot(x, sin(x), type = "l", ylim = c(-1.2, 1.8), col = 3, lty = 2)
points(x, cos(x), pch = 3, col = 4)
lines(x, tan(x), type = "b", lty = 1, pch = 4, col = 6)
title("legend(..., lty = c(2, -1, 1), pch = c(-1,3,4), merge = TRUE)",
      cex.main = 1.1)
legend(-1, 1.9, c("sin", "cos", "tan"), col = c(3,4,6),
       text.col = "green4", lty = c(2, -1, 1), pch = c(-1, 3, 4),
       merge = TRUE, bg = 'gray90')

 
plot.histogram.R


/usr/cnld/shw/R/library/ReadPlot1.R
##############################################################################
#
# R Intro Udo Schwarz. March 10, 2009
#
# /usr/cnld/shw/TeX/introductions/Praktikum+Exercise/2009SS-ComplexSystems/R
#
# lpr -P kubo ReadPlot1.R
#
##############################################################################

### Henning Rust                                     28.06.2007

rm(list=ls())   # clear the workspace; rm() alone removes nothing
par(mfrow=c(1,1))
rows=2
cols=2
par(mfrow=c(rows,cols)) #set up 2 rows x 2 columns

## ASCII data file with 54058 lines

# #d m  yr    m3/s
# #arraysize=4x54056
#  1 11 1850  316
#  2 11 1850  300
#  3 11 1850  284

## rows and columns dat[1:9,1:4]
# vi & ex-commands s/.$//g
# replace the first character            s/^./text/
# replace the last character             s/.$/text/
# insert before the first character      s/^/text/
# insert after the last character        s/$/text/
# awk '{printf "%2.2f\t%2.2f\t%2.2f\t%2.2f\n",$1, $2, $3, $4}' DATA > data 
# zcat oder gunzip -s |  awk '{print $1*1.}' | gzip > name.gz
# cxx xlyap.cxx -lm -o xlyap.x
# limit datasize unlimited


### read the data file
# http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_datin.html
file <- "evs_assignments1.dat"                                 ## define file name
dat <- read.table(file,col.names=c("day","month","year","Q"))  ## read data

# extract multiple columns from a dataframe
#  test table generation
#  r1=c(1,2,3,7,1,3,2)
#  r2=c(4,5,7,8,1,4,3)
#  test=matrix(c(r1,r2),nrow=2,ncol=7,byrow=TRUE)
#  colnames(test)<-c("a1","a2","b1","b2","b3","c1","c2")
#  test
# grep("a",(colnames(test)))
# test.a<-test[,grep("a",(colnames(test)))]  ##? > test.dat

# Look at ?ts !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
# ts(1:10, frequency = 4, start = c(1959, 2)) # 2nd Quarter of 1959
# print( ts(1:10, frequency = 7, start = c(12, 2)), calendar = TRUE)

dat.ts <- ts(dat$Q,start=c(1850,305),frequency=365)            ## convert to time series; start = c(year, day of year)

### plot the time series
plot(dat.ts,ylab="discharge [m^3/s]",main="River Discharge")   ## plot ts


### histogram
hist(dat.ts,xlab="discharge [m^3/s]",main="River Discharge",sub="60 equally spaced breaks",xlim=c(0,2000),breaks=60,prob=FALSE)


### kernel estimates of the probability density
plot(density(dat$Q,bw=20,kernel="gaussian"),
     xlab="discharge [m^3/s]",main="River Discharge",sub="Kernel: Gaussian, sd=20",xlim=c(0,2000))


### estimate parameters of a log-normal distribution by the method of moments
log.dat <- log(dat.ts)                     ## log-transform
log.mu <- mean(log.dat)                    ## estimate mean of transformed data 
log.sd <- sd(log.dat)                      ## estimate standard deviation of transformed data
x <- seq(0,2500,length=500)                ## generate a sequence of quantiles
hist(dat$Q,xlab="discharge [m^3/s]",       ## plot a histogram
     main="River Discharge",sub="log-normal model",
     xlim=c(0,2000),breaks=60,prob=TRUE)
lines(x,dlnorm(x,log.mu,log.sd),col="red") ## add the log-normal distribution with estimated parameters

# pdf("ReadPlot1.R.pdf", onefile=FALSE, height=8, width=6, pointsize=10)
# dev.off()


/usr/cnld/shw/R/library/ReadPlot2.R

##############################################################################
#
# R Intro Udo Schwarz. March 12, 2009
#
# /usr/cnld/shw/TeX/introductions/Praktikum+Exercise/2009SS-ComplexSystems/R
#
# lpr -P kubo ReadPlot2.R
#
##############################################################################


rm(list=ls())   # clear the workspace; rm() alone removes nothing
par(mfrow=c(1,1))
rows=2
cols=2
par(mfrow=c(rows,cols)) #set up 2 rows x 2 columns

## rows and columns dat[1:9,1:4]
# vi & ex-commands s/.$//g
# replace the first character            s/^./text/
# replace the last character             s/.$/text/
# insert before the first character      s/^/text/
# insert after the last character        s/$/text/
# awk '{printf "%2.2f\t%2.2f\t%2.2f\t%2.2f\n",$1, $2, $3, $4}' DATA > data 
# zcat oder gunzip -s |  awk '{print $1*1.}' | gzip > name.gz
# cxx xlyap.cxx -lm -o xlyap.x
# limit datasize unlimited

## ASCII data file with 10003 lines

#       Time            LeftX           LeftY           RightX          RightY
#   0.0000000e+00   1.2448186e-01  -8.0063662e-01   3.5107922e-01  -7.5506732e-01
#   2.0000000e+00   1.3240186e-01  -7.2143662e-01   3.5107922e-01  -6.9170732e-01


### read the data file
# http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_datin.html
file <- "f01.001.dat"                                 ## define file name
dat <- read.table(file,col.names=c("Time","LeftX","LeftY","RightX","RightY"))  ## read data

# extract multiple columns from a dataframe
#  test table generation
#  r1=c(1,2,3,7,1,3,2)
#  r2=c(4,5,7,8,1,4,3)
#  test=matrix(c(r1,r2),nrow=2,ncol=7,byrow=TRUE)
#  colnames(test)<-c("a1","a2","b1","b2","b3","c1","c2")
#  test
# grep("a",(colnames(test)))
# test.a<-test[,grep("a",(colnames(test)))]  ##? > test.dat

# Look at ?ts !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
# ts(1:10, frequency = 4, start = c(1959, 2)) # 2nd Quarter of 1959
# print( ts(1:10, frequency = 7, start = c(12, 2)), calendar = TRUE)

dt=0.002        # sample time

dat.ts <- ts(dat$LeftX,start=1,frequency=1)            ## convert to time series
time.ts <- ts(dat$Time,start=1,frequency=1)            ## convert to time series
time<-time.ts*dt
pos<-diff(dat.ts)
Total<-length(pos)
Zeros<-length(pos[pos==0])
cat("#Zeros = ",Zeros,"\n")
cat(100*Zeros/Total,"% zeros in the data! \n")

### plot the time series
plot(time,pos,xlab="time [secs]",ylab="horizontal increment",main="Left eye",type="l")   ## plot ts


### histogram
hist(pos,xlab="horizontal increment",main="Left eye",breaks=100,prob=TRUE)


### kernel estimates of the probability density
plot(density(pos,bw=20,kernel="gaussian"),
     xlab="horizontal increment",main="Left eye",sub="Kernel: Gaussian, sd=20")


### estimate parameters of a normal distribution by the method of moments
#log.dat <- log(pos)               ## log-transform (not used here)
mu <- mean(pos)                    ## estimate the mean
sd <- sd(pos)                      ## estimate the standard deviation (note: shadows the function sd())

Hist<-hist(pos,plot=FALSE,breaks=50)
H<-(Hist$density) #/length(pos)
# H<-H/sum(H)
X<-Hist$mids
plot(X,H,type="p",xlab="horizontal increment",main="Left eye",sub="normal model" )
lines(X,dnorm(X,mean = mu, sd = sd),col = "blue")
#lines(X,dnorm(X,mean = mu, sd = sd),pch=3, lty=3, col = "blue")

# pdf("ReadPlot1.R.pdf", onefile=FALSE, height=8, width=6, pointsize=10)
# dev.off()


Karsten Ahnert


# corresponds to 13 February 2009, 23:31:30  Karsten Ahnert  aufgabe34.R
# size_t time = 1234567890;
# 54.187385,7.873721 should be the resulting values; that is somewhere on Helgoland
# int mean_x = 195074686 , mean_y = 28345395;

kurtosis <- function(x) {
m4 <- mean((x-mean(x))^4)
kurt <- m4/(sd(x)^4)-3
kurt
}
skewness <- function(x) {
m3 <- mean((x-mean(x))^3)
skew <- m3/(sd(x)^3)
skew
}
k1 <- function( x ) {
  k <- mean(x)
  k
}
k2 <- function( x ) {
  k <- var(x)
  k
}
k3 <- function( x ) {
  m1 <- mean( x )
  m2 <- mean( x*x )
  m3 <- mean( x*x*x )
  k <- m3 - 3*m2*m1 + 2*m1*m1*m1   # third cumulant
  k
}
k4 <- function( x) {
  m1 <- mean( x )
  m2 <- mean( x*x )
  m3 <- mean( x*x*x )
  m4 <- mean( x*x*x*x )
  k <- m4 - 4*m3*m1 - 3*m2*m2 + 12*m2*m1*m1 - 6 *m1*m1*m1*m1
  k
}
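A quick sanity check of these estimators: for a large standard normal sample, skewness, excess kurtosis and the third cumulant should all be close to zero, and the variance close to one. Self-contained sketch:

```r
set.seed(5)
x <- rnorm(1e5)
skew <- mean((x - mean(x))^3) / sd(x)^3        # should be close to 0
kurt <- mean((x - mean(x))^4) / sd(x)^4 - 3    # excess kurtosis, close to 0
c(skew, kurt, var(x))                          # var(x) close to 1
```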

data <- read.table("../material/gps.dat")
mean(data)
var(data)
sd(data)
cor(data)
skewness(data)
kurtosis(data)

zlat = ( data$V2 - mean(data$V2) ) / sd( data$V2 )
zlon = ( data$V3 - mean(data$V3) ) / sd( data$V3 )
skewness( zlat )
kurtosis( zlat )
var( zlat )
mean( zlat )
skewness( zlon )
kurtosis( zlon )
var( zlon )
mean( zlon )
cor( zlat , zlon )






data2 <- read.table("../material/distribution.dat")
mean( data2 )
var( data2 )
k1( data2 )
k2( data2 )
k3( data2 )
k4( data2 )
mean( data2*data2 )
mean( data2*data2*data2 )
mean( data2*data2*data2*data2 )
hist( data2$V1 )


bound <- 10.0
data2clean <- data2$V1[ abs(data2$V1) < bound ]
# hist( data2clean )
test_data <- rcauchy( 10000 )
test_data <- test_data[ abs(test_data) < bound ]
qqplot( data2clean , test_data) 


Exercise 3  Karsten Ahnert

len <- 1000
data <- rlogis( len )
hist( data , freq=FALSE )
write.table( data , file="distribution2.dat" , col.names=FALSE , row.names=FALSE )

mu = mean( data )
sigma = sd( data ) / sqrt(len)
mu
sigma

konfidenz = c( 0.7 , 0.8 , 0.9 , 0.95 )
quants = ( 1.0 + konfidenz ) / 2.0
quants
diff = qnorm( quants , mean = 0 , sd = sigma )
diff
for( i in 1:length(diff) )  {
  print( c( mu , mu - diff[i] , mu + diff[i] ) )
}

# len <- 1000
# data <- rlogis( len )
# hist( data , freq=FALSE )
# write.table( data , file="distribution2.dat" , col.names=FALSE , row.names=FALSE )

ds = read.table( file="distribution2.dat" )
data <- ds$V1;
len = length( data )

mu = mean( data )
sigma = sd( data )
mu
sigma
sigma = sigma / sqrt(len)

konfidenz = c( 0.7 , 0.8 , 0.9 , 0.95 )
quants = ( 1.0 + konfidenz ) / 2.0
quants
diff = qnorm( quants , mean = 0 , sd = sigma )
diff
for( i in 1:length(diff) )  {
  print( c( mu , mu - diff[i] , mu + diff[i] ) )
}

# chisq.R
df <- 10
chii <- function( x , df ) {
  tmp <- 1.0 / ( 2.0^(df/2.0) * gamma(df/2.0) ) * x^(df/2-1) * exp(-x/2)
  tmp[ x<0.0 ] <- 0.0
  tmp
}
x <- seq( -10 , 30 , 0.25 )
plot( x , dchisq( x , df ) , type = "o" , ylim = c(-0.5,0.5) )
points( x , chii( x , df ) , type = "l" , col = "red" )

# distribution1.R
data <- rnorm( 10E5 )

num_of_samples = 1000
sample_length = 100

means <- replicate( num_of_samples, mean( sample( data , sample_length ) ) )
vars <- replicate( num_of_samples, var( sample( data , sample_length ) ) )

hist( means , freq=FALSE )

hist( vars*(sample_length-1), freq=FALSE )
x <- seq( 40 , 180 , 0.1 )
points( x , dchisq( x , df=sample_length-1 ) , type = "l" )

# distribution2.R
data <- rexp( 10E5 )
hist( data )
mean(data)
sd(data)

num_of_samples = 100
num_of_breaks = 10


mean_10 <- replicate( num_of_samples, mean( sample(data,10) ) )
mean_20 <- replicate( num_of_samples , mean( sample(data,20) ) )
mean_100 <- replicate( num_of_samples , mean( sample(data,100) ) )
mean_200 <- replicate( num_of_samples , mean( sample(data,200) ) )

h10 <- hist( mean_10 , breaks=num_of_breaks , plot=FALSE )
h20 <- hist( mean_20 , breaks=num_of_breaks , plot=FALSE )
h100 <- hist( mean_100 , breaks=num_of_breaks , plot=FALSE )
h200 <- hist( mean_200 , breaks=num_of_breaks , plot=FALSE )

plot( h10$mids , h10$density , type = "o" , ylim=c(0,6) , xlab = "mean" , ylab = "P(mean)" )
points( h20$mids , h20$density , type = "o" , col = "red" )
points( h100$mids , h100$density , type = "o" , col = "green" )
points( h200$mids , h200$density , type = "o" , col = "blue" )

Exercise 4  Karsten Ahnert

ts1.R

len <- 1000
sigma = 1.37
x <- runif( len , min = -10.0 , max = 10.0 )
y <- 9.81 + 0.25*x*x +0.001*x*x*x*x
#+ 1.5*sin( x/2.0 )
y <- y + rnorm( len , sd = sigma )
write.table( list( x , y ) , file = "CV.dat" , col.names = FALSE , row.names = FALSE )

ind=seq(1000)
test_ind=sample(ind,500)
eval_ind=setdiff(ind,test_ind)   # disjoint evaluation set; a second sample(ind,500) would overlap test_ind

z=data.frame(x,y)
xtest<-z$x[test_ind]
ytest<-z$y[test_ind]
xeval <-z$x[eval_ind]
yeval <-z$y[eval_ind]

Ndegrees=8
chi2_test <- vector(length=Ndegrees)
chi2_eval <- vector(length=Ndegrees)
for( i in 1:Ndegrees ) {
  fit <- lm( ytest ~ poly( xtest , degree=i ) )
  chi2_test[i] = mean(summary(fit)$residuals^2)
  pred = predict(fit,data.frame(xtest=xeval))   # newdata must use the regressor name xtest
  res  = yeval - pred
  chi2_eval[i] = mean( res^2 )
}

plot( chi2_test - min(chi2_test),log="y")
lines( chi2_eval - min(chi2_eval))

#ts1.R#
len <- 1000
sigma = 0.37

#x <- runif( len , min = -10.0 , max = 10.0 )
#y <- 9.81 + 0.25*x + 1.5*sin( x/2.0 )
#y <- y + rnorm( len , sd = sigma )
#write.table( list( x , y ) , file = "regression1.dat" , col.names = FALSE , row.names = FALSE )

data <- read.table( "regression1.dat" );
x <- data$V1
y <- data$V2


chi2 <- vector()
for( i in 1:25 ) {
  fit <- lm( y ~ poly( x , degree=i ) )
  ynorm = ( y - fit$fit ) / sigma
  chi2 = c ( chi2 , sum( ynorm * ynorm ) )
}
plot( chi2-min(chi2)+1 , log="y" )

plot( chi2 )

fit <- lm( y ~ poly( x , degree=20 ) )
plot( x , y )
lines( sort(x) , fit$fit[order(x)] , col = "red" , lwd = 5 )


Exercise 5  TimeSeriesForecasting.R

#Forecasting time series (with functions in time series library ts)
#from a time series with trend and seasonality
  #Exponential smoothing can be used for short range predictions 
   #if data is trend&seasonality free.
  #Holt-Winters method generalizes the idea to time series with 
   #trend and seasonality 

#Exercise part 1: Decompose the time series into trend, seasonality and residual component
#Exercise part 2: Forecast applying Holt-Winters

#Data : ElectricityProduction.dat
  #Australia monthly production of electricity 
  #Actual m.KWH
  #Source: ABS 8301.0
  #Jan 1956 - August 1995

#E <- read.table("...ElectricityProduction.dat");  
E <- ts(E,start=1956,freq=12); #convert to time series object


#Trend component can be found by moving average (or other linear filters):
E.ma <- filter(E,filter=rep(1/12,12));
plot(E,type="l"); lines(E.ma,col="purple")
#Components(trend, seasonal and residual) can be found :
plot(stl(E[,1],s.window="periodic"));



#Forecasting
  #Applying Holt-Winters: 
HoltWinters(E)
plot(E); lines(HoltWinters(E)$fitted[,1],col="red")

#The function predict() makes predictions from a model fitted to a time series:
E.HW <- HoltWinters(E)
predict(E.HW,n.ahead=12)
plot(E,xlim=c(1956,1999)); lines(predict(E.HW,n.ahead=48),col=2)
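If ElectricityProduction.dat is not at hand, the same workflow runs on a synthetic monthly series with trend and seasonality (a sketch; the trend slope, seasonal amplitude and noise level are made up):

```r
set.seed(11)
n <- 12 * 20                                   # 20 years of monthly values
tt <- 1:n
E <- ts(10 + 0.05 * tt + 3 * sin(2 * pi * tt / 12) + rnorm(n, sd = 0.5),
        start = 1956, frequency = 12)
fit <- HoltWinters(E)                          # level, trend and seasonal smoothing
p <- predict(fit, n.ahead = 12)                # one-year forecast
plot(E, xlim = c(1956, 1977)); lines(p, col = 2)
```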



#Kalman Filtering

# Generate a sequence of the observable with 1000 observations
# sampling chosen "appropriately", i.e. by experimenting with dt
N  <- 1000
dt <- 1.0/N
t  <- seq(0,1,by=dt)
N  <- length(t)
y  <- sin(100*t+4*sin(10*t))


# set the variance of the observable
eps<- 0.1
Cy <- eps*eps
# Error: yMeasOut <- y + rnorm(length(t),sd=Cy)  would pass the variance as sd
yMeasOut <- y + rnorm(length(t),sd=eps)


# First guess for Covariance of state
E  <- matrix(c(1,0,0,1), nrow = 2, ncol=2)
I  <- diag(2)


# set the variance of the state and calculate covariance matrix of state
eta<- 1e-1
Cx <- eta*eta*I


# initialize x
x  <- c(.1,0.)


# Define the state evolution by x_t = A x_{t-1} 
A  <- matrix(c(2,-1,1,0), nrow=2, ncol=2)


yPredOut <- rep(0,length(t))

# Loop over time
for(i in 1:N)
{

  ypred <- sin(x[1])
  ymeas <- yMeasOut[i]
 
  # could be done in advance by vector Operation,
  # placed here for pedagogical reasons
  # B is a 1x2 matrix
  B <- matrix(c(cos(x[1]),0),nrow=1,ncol=2)

  # Now calculate the Kalman gain matrix
  K   <- E%*%t(B)
  tmp <- 1.0/( B%*%E%*%t(B) + Cy  )
  K   <- K%*% tmp
 
  # estimate state
  x   <- x + K*(ymeas-ypred)
 
  # correct state error matrix
  E <- (I-K%*%B)%*%E
 
  # predict next state
  x <- A%*%x
 
  # next error matrix
  E <- A%*%E%*%t(A)+Cx

  yPredOut[i] <- ypred
 
  # print(i)
  # print(yPredOut[i])
}

require(graphics)
plot(t,yMeasOut,type="n",ylab = "y_{measured}")
points(t,yMeasOut,pch=21,bg="blue",col="blue",cex=0.4)
text <- sprintf("Predicted vs. Measured Observable, sigma=%1.2e",eta)
title(main = text )
legend("topright","y_measured",pch=21,col="blue")#, title="measured")

lines(t,yPredOut,col="black",lwd=1.5)
legend("bottomright", lty=1,"y_predicted")#, title="predicted")

Exercise 6  Karsten Ahnert

forward.R

# define the HMM

a <- numeric( 4 )
dim( a ) <- c( 2 , 2 )
a[1,1] <- 0.2
a[1,2] <- 0.8
a[2,1] <- 0.35
a[2,2] <- 0.65

b <- numeric( 4 )
dim( b ) <- c( 2 , 2 )
b[1,1] <- 0.4
b[1,2] <- 0.6
b[2,1] <- 0.1
b[2,2] <- 0.9

rand_int_trans <- function( state ) {
  rand <- runif( 1 )
  if( state == 'A' ) {
    if( rand <= a[1,1] ) ret = 'A'
    else ret = 'B'
  }
  else {
    if( rand < a[2,1] ) ret = 'A'
    else ret = 'B'
  }
  ret
}
rand_obs <- function( state ) {
  rand <- runif( 1 )
  if( state == 'A' ) {
    if( rand < b[1,1] ) ret = 1
    else ret = 2
  }
  else {
    if( rand < b[2,1] ) ret = 1
    else ret = 2
  }
  ret
}

len <- 100
initial_state <- 'A'
states <- character( len )
observations <- numeric( len )
states[1] <- initial_state
observations[1] <- rand_obs( states[1] )
for( i in 2:len ) {
 states[i] <- rand_int_trans( states[i-1] )
 observations[i] <- rand_obs( states[i] )
}

# write.table( observations , file="observation.dat" , col.names=FALSE , row.names=FALSE )

# data <- read.table( "observation.dat" )
# observation <- data$V1
# len <- length( observation )

alpha <- numeric( 2*len )
dim( alpha ) <- c( len , 2)

alpha[ 1 , 1 ] = b[ 1 , observations[1] ] * 1
alpha[ 1 , 2 ] = b[ 2 , observations[1] ] * 0

for( i in 2:len ) {
  alpha[ i , 1 ] = ( alpha[ i-1 , 1 ] * a[ 1 , 1 ] + alpha[ i-1 , 2 ] * a[ 2 , 1 ] ) * b[ 1 , observations[i] ]
  alpha[ i , 2 ] = ( alpha[ i-1 , 1 ] * a[ 1 , 2 ] + alpha[ i-1 , 2 ] * a[ 2 , 2 ] ) * b[ 2 , observations[i] ]
}

alpha

p = alpha[len,1] + alpha[len,2]
pa = alpha[len,1] / p
pb = alpha[len,2] / p
pa
pb

hmm.R

a11 <- 0.2
a12 <- 0.8
a21 <- 0.35
a22 <- 0.65
b11 <- 0.4
b12 <- 0.6
b21 <- 0.1
b22 <- 0.9
rand_int_trans <- function( state ) {
  rand <- runif( 1 )
  if( state == 'A' ) {
    if( rand <= a11 ) ret = 'A'
    else ret = 'B'
  }
  else {
    if( rand < a21 ) ret = 'A'
    else ret = 'B'
  }
  ret
}
rand_obs <- function( state ) {
  rand <- runif( 1 )
  if( state == 'A' ) {
    if( rand < b11 ) ret = 1
    else ret = 2
  }
  else {
    if( rand < b21 ) ret = 1
    else ret = 2
  }
  ret
}

len <- 10000
initial_state <- 'A'
states <- character( len )   # states take the values 'A'/'B'
observations <- numeric( len )
states[1] <- initial_state
observations[1] <- rand_obs( states[1] )
for( i in 2:len ) {
  states[i] <- rand_int_trans( states[i-1] )
  observations[i] <- rand_obs( states[i] )
}

num1 <- 0
num2 <- 0
for( i in 1:len ) {
  if( states[i] == 'A' ) num1 <- num1 + 1
  else num2 <- num2 + 1
}
num1
num2
num1 <- 0
num2 <- 0
for( i in 1:len ) {
  if( observations[i] == 1 ) num1 <- num1 + 1
  else num2 <- num2 + 1
}
num1
num2

states
observations

write.table( observations , file="observation.dat" , col.names=FALSE , row.names=FALSE )
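
The state counts from the simulation can be checked against the stationary distribution of the chain, which solves pi = pi A. A small sketch (the layout A[i,j] = P(j | i) matches the a11..a22 values above):

```r
# Stationary distribution of the two-state chain via the left
# eigenvector of the transition matrix for eigenvalue 1.
A <- matrix( c( 0.2 , 0.35 , 0.8 , 0.65 ) , nrow = 2 , ncol = 2 )
ev <- eigen( t( A ) )                  # eigen() sorts by decreasing modulus
pi_stat <- Re( ev$vectors[ , 1 ] )     # eigenvector for eigenvalue 1
pi_stat <- pi_stat / sum( pi_stat )    # normalize to probabilities
pi_stat   # roughly (0.304, 0.696); compare with num1/len and num2/len
```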

hmm.R (second version: one Baum-Welch re-estimation step)

source( "hmm_func.R" )
A <- matrix( c( 0.2 , 0.35 , 0.8 , 0.65 ) , nrow = 2 , ncol = 2 )
B <- matrix( c( 0.4 , 0.1 , 0.6 , 0.9 ) , nrow = 2 , ncol = 2 )
initial <- c( 1 , 0 )
org_hmm <- list( A=A , B=B , initial=initial )
rm( A , B , initial )
class(org_hmm) <- "hmm"
len <- 100
observation <- gen_observation( org_hmm , len )
#
A <- matrix( c( 0.3 , 0.5 , 0.7 , 0.5 ) , nrow = 2 , ncol = 2 )
B <- matrix( c( 0.5 , 0.5 , 0.5 , 0.5 ) , nrow = 2 , ncol = 2 )
initial <- c( 0.5 , 0.5 )
cur <- list( A=A , B=B , initial=initial )
remove( A , B , initial )
class( cur ) <- "hmm"
#
cur
alpha <- gen_alpha( cur , observation )
beta <- gen_beta( cur , observation )
gamma <- gen_gamma( alpha , beta )
xi <- gen_xi( cur , observation , alpha , beta )
cur <- estimate_new_hmm( gamma , xi , observation )
cur


# check: alpha[t,1]*beta[t,1] + alpha[t,2]*beta[t,2] = P(observation) for every t
for( i in 1:len ) {
  print(alpha[i,1]*beta[i,1] + alpha[i,2]*beta[i,2])
}

# check: the xi numerator summed over i,j also gives P(observation)
for( t in 1:(len-1) ) {
  s <- 0.0
  for( i in (1:2) )
    for( j in (1:2) )
      s <- s + alpha[t,i]*cur$A[i,j]*cur$B[j,observation[t+1]]*beta[t+1,j]
  print(s)
}

# check: gamma[t,i] equals the sum over j of xi[t,i,j]
for( t in 1:(len-1) ) {
  for( i in 1:2 ) {
    s <- 0
    for( j in 1:2 ) { s <- s + xi[t,i,j] }
    print(c(gamma[t,i],s))
  }
}

hmm_func.R

rand_int_trans <- function( hmm , state ) {
  rand <- runif( 1 )
  ret <- 'B'
  if( state == 'A' ) { if( rand <= hmm$A[1,1] ) ret = 'A' }
  else { if( rand < hmm$A[2,1] ) ret = 'A' }
  ret
}

rand_obs <- function( hmm , state ) {
  rand <- runif( 1 )
  ret <- 2
  if( state == 'A' ) { if( rand < hmm$B[1,1] ) ret = 1 }
  else { if( rand < hmm$B[2,1] ) ret = 1 }
  ret
}

rand_initial_state <- function( hmm ) {
  rand <- runif(1)
  ret <- 'B'
  if( rand < hmm$initial[1] ) ret <- 'A'
  ret
}

gen_observation <- function( hmm , len ) {
  observation <- numeric(len)
  state <- rand_initial_state( hmm )
  observation[1] <- rand_obs( hmm , state )
  for( i in 2:len ) {
    state <- rand_int_trans( hmm , state )
    observation[i] <- rand_obs( hmm , state )
  }
  observation
}

gen_alpha <- function( hmm , observation ) {
  
  len <- length(observation)
  alpha <- matrix( numeric( 2*len ) , nrow = len , ncol = 2 )

  alpha[ 1 , 1 ] = hmm$B[ 1 , observation[1] ] * hmm$initial[1]
  alpha[ 1 , 2 ] = hmm$B[ 2 , observation[1] ] * hmm$initial[2]

  for( i in 2:len ) {
    alpha[ i , 1 ] = ( alpha[ i-1 , 1 ] * hmm$A[ 1 , 1 ] + alpha[ i-1 , 2 ] * hmm$A[ 2 , 1 ] ) * hmm$B[ 1 , observation[i] ]
    alpha[ i , 2 ] = ( alpha[ i-1 , 1 ] * hmm$A[ 1 , 2 ] + alpha[ i-1 , 2 ] * hmm$A[ 2 , 2 ] ) * hmm$B[ 2 , observation[i] ]
  }
  alpha
}

gen_beta <- function( hmm , observation ) {

  len <- length(observation)
  beta <- matrix( numeric( 2*len ) , nrow = len , ncol = 2 )

  beta[ len , 1 ] <- 1
  beta[ len , 2 ] <- 1

  for( i in seq(len-1 , 1 , -1) ) {
    beta[ i , 1 ] <- hmm$A[1,1] * hmm$B[ 1 , observation[i+1] ] * beta[ i+1 , 1] +
                     hmm$A[1,2] * hmm$B[ 2 , observation[i+1] ] * beta[ i+1 , 2]
    beta[ i , 2 ] <- hmm$A[2,1] * hmm$B[ 1 , observation[i+1] ] * beta[ i+1 , 1] +
                     hmm$A[2,2] * hmm$B[ 2 , observation[i+1] ] * beta[ i+1 , 2]
  }

  beta
}

gen_gamma <- function( alpha , beta ) {

  len <- dim(alpha)[1]
  gamma <- matrix( numeric(2*len) , nrow = len , ncol = 2 )
  # alpha[t,]*beta[t,] sums to P(observation) for every t,
  # so the normalization can be taken once, at t = 1
  nn <- alpha[1,1]*beta[1,1] + alpha[1,2]*beta[1,2]
  
  for( i in 1:len ) {
    gamma[ i , 1 ] = alpha[i,1]*beta[i,1] / nn
    gamma[ i , 2 ] = alpha[i,2]*beta[i,2] / nn
  }
  gamma
}

gen_xi <- function( hmm , observation , alpha , beta ) {

  len <- length( observation )
  xi <- numeric(4*(len-1))
  dim(xi) <- c(len-1,2,2)
  nn <- alpha[1,1]*beta[1,1] + alpha[1,2]*beta[1,2]   # normalization: P(observation)

  for( t in 1:(len-1) ) {
    for( i in 1:2 ) {
      for( j in 1:2 ) {
        xi[ t , i , j ] = alpha[t,i] * hmm$A[i,j] * hmm$B[j,observation[t+1]] * beta[t+1,j] / nn
      }
    }
  }
  xi
}

estimate_new_hmm <- function( gamma , xi , observation ) {

  len <- dim(gamma)[1]
  initial <- gamma[1,]

  A <- matrix( rep(0,4) , ncol=2 , nrow=2 )

  sg <- c(0,0)
  for( t in 1:(len-1) ) {
    for( i in 1:2 ) {
      sg[i] <- sg[i] + gamma[t,i]

      for( j in 1:2 ) A[i,j] <- A[i,j] + xi[t,i,j]
    }
  }
  for( i in 1:2 )
    for( j in 1:2 )
      A[i,j] <- A[i,j] / sg[i]
  

  B <- matrix( rep(0,4) , ncol=2 , nrow=2 )

  for( i in 1:2 ) {

    sg <- 0
    for( t in 1:len ) sg <- sg + gamma[t,i]
    
    sg1 <- c( 0,0 )
    for( k in 1:2 ) for( t in 1:len ) if( observation[t]==k ) sg1[k] <- sg1[k] + gamma[t,i]

    B[i,1] <- sg1[1] / sg
    B[i,2] <- sg1[2] / sg
  }




  new <- list( A=A , B=B , initial=initial )
  class(new) <- "hmm"
  new
}
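
Besides the forward-backward quantities, one often wants the most probable state path given the observations. The Viterbi recursion below is a self-contained sketch using the same matrix layout as the hmm list above; it works in log space to avoid underflow.

```r
# Viterbi decoding for the two-state HMM: the most probable state
# sequence (coded 1/2) given an observation sequence.
# A[i,j] = P(j|i), B[i,k] = P(observation k | state i).
viterbi <- function( A , B , initial , observation ) {
  len <- length( observation )
  delta <- matrix( -Inf , nrow = len , ncol = 2 )   # log of best path prob
  psi   <- matrix( 0 , nrow = len , ncol = 2 )      # backpointers
  delta[1,] <- log( initial ) + log( B[ , observation[1] ] )
  for( t in 2:len ) {
    for( j in 1:2 ) {
      cand <- delta[t-1,] + log( A[,j] )
      psi[t,j]   <- which.max( cand )
      delta[t,j] <- max( cand ) + log( B[ j , observation[t] ] )
    }
  }
  # backtrack from the best final state
  path <- numeric( len )
  path[len] <- which.max( delta[len,] )
  for( t in (len-1):1 ) path[t] <- psi[ t+1 , path[t+1] ]
  path
}
```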

cv.R

require(graphics)

## Predictions

len=12
x <- seq(-2,2,length.out=len)
y <- x*x + rnorm(len)
d=seq(-2,2,length.out=100)


Ndegrees=7
chi2_test <- vector(length=Ndegrees)
chi2_eval <- vector(length=Ndegrees)

plot(x,y)
for( i in 1:Ndegrees )
{
  fit <- lm( y ~ poly( x ,i ) )
  chi2_test[i] = mean(summary(fit)$residuals^2)

  # give each fit its own name:
  # assign() binds the object to the name, paste() builds the name,
  # sep="." joins the pieces with a dot instead of a space,
  # so the fits are called fit.1, fit.2, ...
  assign(paste('fit',i,sep='.'),fit)

  lines(d, predict(fit,data.frame(x=d)),lty=(i+1))
}

plot(chi2_test,main="in-sample chi^2 (training data)")

xe = seq(-2,2,length.out=1000)
ye= xe*xe+ rnorm(length(xe))

for( i in 1:Ndegrees )
{
  fit = get(paste("fit",i,sep="."))

  # Predict on new data
  pred = predict(fit,data.frame(x=xe))

  ## Calculate the residuals( chi^2)
  residuals = ye - pred
  chi2_eval[i] = mean(residuals*residuals)
}
plot(chi2_eval,main="out-of-sample chi^2 (evaluation data)")
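
Drawing a fresh evaluation set, as above, is a luxury; with real data the out-of-sample error has to be estimated from the training data itself. A self-contained 5-fold cross-validation sketch for the same polynomial fits (fold assignment is random):

```r
# 5-fold cross-validation: each fold is held out once, the model is fit
# on the remaining folds, and the held-out prediction error is averaged.
set.seed( 1 )
len <- 60
x <- seq( -2 , 2 , length.out = len )
y <- x*x + rnorm( len )
k <- 5
folds <- sample( rep( 1:k , length.out = len ) )   # random fold labels
Ndegrees <- 7
cv_err <- numeric( Ndegrees )
for( i in 1:Ndegrees ) {
  err <- 0
  for( f in 1:k ) {
    train <- data.frame( x = x[ folds != f ] , y = y[ folds != f ] )
    test  <- data.frame( x = x[ folds == f ] , y = y[ folds == f ] )
    fit  <- lm( y ~ poly( x , i ) , data = train )
    pred <- predict( fit , newdata = test )
    err  <- err + sum( ( test$y - pred )^2 )
  }
  cv_err[i] <- err / len
}
plot( cv_err , main = "5-fold CV chi^2" )
```

As with the fresh evaluation set, the CV error should reach its minimum near degree 2 and grow again as overfitting sets in.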

gcv.R
library(boot)

len <- 200
sigma = 0.37
x <- runif( len , min = -10.0 , max = 10.0 )
y <- 9.81 + 0.25*x + 1.5*sin( x/2.0 )
y <- y + rnorm( len , sd = sigma )
data <- data.frame( x=x , y=y )

maxdegree <- 20
cv <- numeric( maxdegree )
for( d in 1:maxdegree ) {
  fit <- glm( y ~ poly( x , degree=d ) , data=data )
  cv[d] <- cv.glm( data , fit )$delta[2]
}
plot( cv-min(cv) , log="y" )

spline_interpolation.R
time <- seq( 0 , 10 , 0.5 )
len <- length(time)
data <- 0.45 * sin( time ) - 0.12 * time
noise <- rnorm( len , sd = 0.2 )
data <- data + noise

plot( time , data , type="o" )
inter1 <- splinefun( time , data )
inter2 <- splinefun( time , data , method="monoH.FC" )
inter3 <- splinefun( time , data , method="periodic" )  # warns unless y[1] == y[len]
inter4 <- splinefun( time , data , method="natural" )
intertime <- seq( 0 , 10 , 0.01 )
lines( intertime , inter1(intertime) , type="l" , col="red" )
lines( intertime , inter2(intertime) , type="l" , col="green" )
lines( intertime , inter3(intertime) , type="l" , col="blue" )
lines( intertime , inter4(intertime) , type="l" , col="yellow" )

spline_smoothing.R
#smooth.spline(x, y = NULL, w = NULL, df, spar = NULL,
#              cv = FALSE, all.knots = FALSE, nknots = NULL,
#              keep.data = TRUE, df.offset = 0, penalty = 1,
#              control.spar = list())

time <- seq( 0 , 10 , 0.1 )
len <- length(time)
data <- 0.45 * sin( time ) - 0.12 * time
noisydata <- data + rnorm( len , sd = 0.2 )

plot( time , noisydata , type="o" )
smooth <- smooth.spline( time , noisydata ,nknots=80 )
lines( time , data , col="blue" , lwd=3 )
lines( smooth$x , smooth$y , col="red" , lwd=1 )
smooth$fit$knot
smooth$lambda
str(smooth)
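
Instead of fixing nknots by hand, smooth.spline can pick the smoothing parameter itself: with cv = FALSE the penalty lambda is chosen by generalized cross-validation, with cv = TRUE by ordinary leave-one-out CV. A short sketch on the same test signal:

```r
# Let smooth.spline select lambda automatically via GCV (cv = FALSE).
time <- seq( 0 , 10 , 0.1 )
data <- 0.45 * sin( time ) - 0.12 * time
noisydata <- data + rnorm( length( time ) , sd = 0.2 )
sm_gcv <- smooth.spline( time , noisydata , cv = FALSE )
sm_gcv$lambda   # the automatically selected smoothing parameter
sm_gcv$df       # equivalent degrees of freedom of the fit
```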

Uebung 8 Karsten Ahnert

hopfield.R
update_hnn<- function( T , state ) {
  tmp <- T %*% state
  indices = ( state*tmp < 0 )
  state[ indices ] = -state[ indices ]
  state
}
train_hnn <- function( T , pattern ) {
  tmp <- pattern %*% t(pattern)
  T <- T + tmp
  T
}

len1 <- 4
len2 <- 4
len <- len1*len2
T <- matrix( rep( 0 , len*len) , ncol=len , nrow=len )   # note: T masks the shorthand for TRUE
T

tr_pattern1 <- c(1,1,-1,-1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1)
tr_pattern2 <- c(-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,-1,-1,1,1)

T <- train_hnn( T , tr_pattern1 )
T <- train_hnn( T , tr_pattern2 )

pattern <- c(-1,1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,1,-1,1,1)
state <- pattern

state <- update_hnn( T , state )
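
A single update step may not reach a fixed point. The self-contained sketch below repeats the training (with the common convention of zeroing the self-couplings), flips two bits of the first stored pattern, and iterates until the state stops changing; the corrupted probe should be restored to the stored pattern.

```r
# Run the Hopfield update to a fixed point and check pattern recovery.
update_hnn <- function( W , state ) {
  tmp <- W %*% state
  indices <- ( state * tmp < 0 )       # units whose sign disagrees with the field
  state[ indices ] <- -state[ indices ]
  state
}
p1 <- c(1,1,-1,-1,1,1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1)
p2 <- c(-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,1,1,-1,-1,1,1)
W <- p1 %*% t(p1) + p2 %*% t(p2)       # Hebbian outer-product training
diag( W ) <- 0                         # convention: no self-coupling
state <- p1 ; state[c(2,5)] <- -state[c(2,5)]   # corrupt two bits
repeat {
  new <- update_hnn( W , state )
  if( all( new == state ) ) break      # fixed point reached
  state <- new
}
state   # recovered pattern: equals p1
```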

nnet.R
library(nnet)

tr_pattern1 <- c(1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0)
tr_pattern2 <- c(0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1)
tr_pattern <- t( matrix( c(tr_pattern1 , tr_pattern2 ) , ncol = 2 , nrow = 16 ) )
targets <- class.ind( c( "x" , "y" ) )
ir1 <- nnet( tr_pattern , targets , size=2 , rang=0.1 )
pattern <- c(1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0)
predict( ir1 , pattern )


nnet.R~
library(nnet)

test.cl <- function(true, pred) {
    true <- max.col(true)
    cres <- max.col(pred)
    table(true, cres)
}

d1 <- 4
d2 <- 4
tr_pattern1 <- matrix( c(1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0) , ncol=d1 , nrow=d2 )
tr_pattern2 <- matrix( c(0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1) , ncol=d1 , nrow=d2 )

tr_pattern <- data.frame( matrix( c( as.vector(tr_pattern1) , as.vector(tr_pattern2) ) , nrow = 2 , byrow = TRUE ) )
targets <- class.ind( c( "1" , "2" ) )
ir1 <- nnet( tr_pattern , targets , size=2 , rang=0.1 )

pattern <- matrix( c(1,1,0,1,1,1,0,0,0,0,0,0,0,0,0,0) , ncol=d1 , nrow=d2 )
pattern <- data.frame( matrix( as.vector(pattern) , ncol=d1*d2 , nrow=1 ) )
test.cl( targets[1, ,drop=FALSE] , predict( ir1 , pattern ) )

Uebung 10 Karsten Ahnert

bernoulli.R
len = 1000
x <- numeric( len )
x[1] = 0.2453456456456
for( i in 2:len ) {
  if( x[i-1] < 0.5 ) x[i] = 2 * x[i-1]
  else x[i] = 2.0 * x[i-1] - 1.0
}
write.table( x , file="bernoulli.dat" , col.names=FALSE , row.names=FALSE )
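
Caveat: in double precision the doubling map shifts the mantissa bits to the left, so every orbit collapses to exactly 0 after at most about 55 iterations; most of the 1000 values written above are therefore 0. A quick self-contained check:

```r
# The Bernoulli (doubling) map in floating point: each step discards one
# bit of the mantissa, so the orbit hits exactly 0 after ~52-55 steps.
x <- numeric( 100 )
x[1] <- 0.2453456456456
for( i in 2:100 ) {
  if( x[i-1] < 0.5 ) x[i] <- 2 * x[i-1]
  else x[i] <- 2 * x[i-1] - 1
}
sum( x == 0 )   # number of iterates that have already collapsed to zero
```

To study this map numerically one has to work around the finite mantissa, e.g. by iterating the binary expansion symbolically or by adding tiny noise in each step.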

henon.R
len <- 10000
a <- 1.4
b <- 0.3
x <- matrix( numeric( 2*len ) , len , 2 )
x[1,1] = 0.5
x[1,2] = 0.5
for( i in 2:len ) {
  x[i,1] = a - x[i-1,1] * x[i-1,1] + b * x[i-1,2]
  x[i,2] = x[i-1,1]
}
x
write.table( x , row.names=FALSE , col.names=FALSE , file="henon.dat" )
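
The second column stores the delayed coordinate, so plotting the two columns against each other reveals the attractor. A self-contained sketch repeating the iteration:

```r
# Iterate the Henon map and plot x_n against x_{n-1} (the second
# column) to show the folded structure of the attractor.
len <- 10000 ; a <- 1.4 ; b <- 0.3
x <- matrix( 0 , len , 2 )
x[1,] <- c( 0.5 , 0.5 )
for( i in 2:len ) x[i,] <- c( a - x[i-1,1]^2 + b * x[i-1,2] , x[i-1,1] )
plot( x[,1] , x[,2] , pch="." , xlab="x_n" , ylab="x_n-1" , main="Henon attractor" )
```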

uebung11.R
require(Rwave)

N=1024
t=seq(1:N)

# sine wave
y=sin(2*pi*32.0*t/N)

# try different numbers of octaves and voices (each call plots its scalogram)
x=cwt(y,6,12)
x=cwt(y,6,6)
x=cwt(y,4,6)
x=cwt(y,10,6)

# frequency modulation
y2 <- sin(2*pi*32/N*t+10*sin(2*pi*4/N*t))
fac <- sin( t / N * pi )
y2 <- y2 * fac
plot(y2,type="l")

cwt(y2,5,10)

x2=cwt(y2,6,10)



# white noise
y3=rnorm(1024)

x3=cwt(y3,6,11)
x3=cwtTh(y3,6,11,9)
x3=cwtTh(y3,6,11,50)

