Home page
URL:
http://www.agnld.uni-potsdam.de/~shw/Lehre/lehrangebot/2007SS-DAM/2007SS-DAM.html
Summer semester 2007
Test (Klausur): July 20, 2007 (20.7.07) from 11:00 to 12:30 in room 4.22 (air-conditioned computer lab).
Sufficient condition for obtaining the ECTS points: more than 50 % of the points in the test
J. Kurths, N. Marwan, H. Rust, N. Wessel, G. Zamora Lopez (Lectures)
Fri 11:00-12:30, 1.19.4.15
N. Marwan, G. Zamora Lopez, H. Rust & U. Schwarz (Exercises) (shw AT agnld.uni-potsdam.de)
Tue 13:15-14:45, Building 19 (Haus 19), Room 423 (computer lab)
Thu 15:15-16:45, Building 19 (Haus 19), Room 423 (computer lab) or 316
46. Nonlinear data analysis and modeling in sciences
This lecture is one of the elective-compulsory lectures ("wahlobligatorische Vorlesungen") for
the elective subject 1 ("Wahlpflichtfach 1") in nonlinear dynamics ("Nichtlineare Dynamik").
Using several methods of data and time series analysis (moments, histograms,
statistical tests, correlation functions, EOFs, spectra, wavelets, phase space
reconstruction, recurrence plots, symbolic dynamics) we will consider data from models
(maps, ODEs, ARMA processes, Langevin equations, stochastic
differential equations, networks) and measured data.
As the focus of the lectures is on the techniques of data analysis, we
choose examples from various fields, stressing applications that are not
part of "classical physics":
1. Heart dynamics. Variability of cardiovascular signals - symbolic dynamics, AR models
2. Evolution of bone structure - image processing, symbolic dynamics
3. Palaeoclimate - spectra, recurrence, synchronization
4. Natural hazards, extreme events (floods, storms, earthquakes) (statistics using R)
5. Cognition & brain networks
6. Complex networks (neural network, cell metabolism)
7. Visualization of general circulation model data
The lectures complement the lectures on nonlinear dynamics.
They are intended to bridge the gap between classical physics education and
problems from neighbouring disciplines, preparing the participants for
interdisciplinary research.
Knowledge of a programming language is helpful but not necessary.
Course assessment: solved exercises and a test in the computer lab
Exercises:
Please send the program code you used to shw AT agnld.uni-potsdam.de.
The code should include a discussion of your results.
- Stochastic processes and samples. Estimation of moments, probability density functions and
entropies of random variables
a] Plot one position component x(t) of the harmonic oscillator
[t=(1:N)*dt; plot(t,sin(2*pi*t/Period),'d-')] corrupted by additive (observational) Gaussian noise, for
three different signal-to-noise ratios (SNR1>>1, SNR2=1, SNR3<<1). (1 point)
b] Plot the three corresponding estimates of the probability density p(x). (1 point)
c] Estimate the first two moments. (1 point)
d] Estimate the entropies [-sum(log(p(find(p))).*p(find(p)))] of appropriately binned samples. (1 point)
e] Estimate the entropies of texts (string handling):
x=textread('A.txt','%c'); x=double(x); double('A'), char(228), mx=x(x<123). (1 point)
f] Submit the commented Matlab scripts or functions you used. (1 point)
Date due: April 26
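As a rough illustration of parts a] to d], the following Python sketch (standard library only; the course itself uses Matlab) generates a noisy sinusoid and estimates its first two moments and the entropy of the binned sample. All function names are hypothetical, and defining the SNR as the ratio of signal variance to noise variance is an assumption; the exercise does not fix a definition.

```python
import math
import random


def noisy_sine(n, dt, period, snr, seed=0):
    """Samples of sin(2*pi*t/period) at t = dt, 2*dt, ..., n*dt plus Gaussian noise.

    Assumption: SNR = signal variance / noise variance, with the variance of a
    unit-amplitude sine taken as 1/2 (exact only over whole periods).
    """
    rng = random.Random(seed)
    noise_std = math.sqrt(0.5 / snr)
    return [math.sin(2 * math.pi * (k + 1) * dt / period) + rng.gauss(0.0, noise_std)
            for k in range(n)]


def moments(x):
    """Sample mean and unbiased sample variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / (len(x) - 1)
    return mean, var


def histogram_probabilities(x, bins):
    """Relative frequencies of the sample in equally wide bins."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0  # guard against a constant sample
    counts = [0] * bins
    for v in x:
        k = min(int((v - lo) / width), bins - 1)  # put the maximum into the last bin
        counts[k] += 1
    return [c / len(x) for c in counts]


def shannon_entropy(p):
    """Shannon entropy in nats; empty bins are skipped, as in the Matlab hint."""
    return -sum(q * math.log(q) for q in p if q > 0)


if __name__ == "__main__":
    for snr in (100.0, 1.0, 0.01):
        x = noisy_sine(2000, 0.05, 1.0, snr)
        p = histogram_probabilities(x, 20)
        print(snr, moments(x), shannon_entropy(p))
```

The entropy estimate depends on the number of bins, so the same binning should be used when comparing the three SNR cases.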
- Symbolic dynamics: Shannon entropy, Renyi entropy, mutual information, algorithmic complexity
a] Calculate the Shannon entropy, the Renyi entropy (q=0.2 and q=2), the mutual information
(gnuplot and TISEAN: plot "<./mutual -b50 -D30 Random.dat" w lp) and the
algorithmic complexity of a random telegraph sample [floor(rand(N,1)+0.5)],
a periodic signal [x=floor((sin(2*pi/2.1*t)+1));] and trajectories of the logistic map for
different control parameters, as a function of the sample size N.
Useful Matlab script: symbolics.m. (12 points)
Date due: May 10
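A minimal Python sketch of the entropy part of this exercise is given below (standard library only; it is not the course's symbolics.m). It assumes a binary partition of the logistic map at x = 0.5 and estimates word probabilities by counting overlapping words; both choices, and all function names, are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def random_telegraph(n, seed=0):
    """Binary random sequence, analogous to floor(rand(N,1)+0.5)."""
    rng = random.Random(seed)
    return [1 if rng.random() >= 0.5 else 0 for _ in range(n)]


def logistic_symbols(r, n, x0=0.3, transient=1000):
    """Symbol sequence of the logistic map x -> r*x*(1-x), partition at 0.5."""
    x = x0
    for _ in range(transient):  # discard the transient
        x = r * x * (1 - x)
    symbols = []
    for _ in range(n):
        x = r * x * (1 - x)
        symbols.append(1 if x >= 0.5 else 0)
    return symbols


def word_probabilities(symbols, length):
    """Relative frequencies of overlapping words of the given length."""
    words = Counter(tuple(symbols[i:i + length])
                    for i in range(len(symbols) - length + 1))
    total = sum(words.values())
    return [c / total for c in words.values()]


def renyi_entropy(p, q):
    """Renyi entropy of order q in nats; q = 1 gives the Shannon entropy."""
    if abs(q - 1.0) < 1e-12:
        return -sum(v * math.log(v) for v in p if v > 0)
    return math.log(sum(v ** q for v in p if v > 0)) / (1.0 - q)


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        p = word_probabilities(logistic_symbols(4.0, n), 3)
        print(n, renyi_entropy(p, 0.2), renyi_entropy(p, 1), renyi_entropy(p, 2))
```

Repeating the loop for several sample sizes N shows how strongly the estimates depend on the amount of data, which is the point of the exercise.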
- DFT, periodogram, power spectra 12 points
a] Compute the discrete Fourier transforms and the periodograms of 4 samples of
a Gaussian noise process and a harmonic process.
b] Estimate the power spectra of both processes.
c] Estimate the acf, the periodogram ([Pxx,f]=periodogram(y); [Pxx,f]=pmcov(y,p); [Pxx,f]=pyulear(y,p);)
and the power spectrum (Welch, Yule-Walker) of time series of different
typical processes (Gaussian noise, harmonic process) and systems (x component of the
Roessler system, z component of the Lorenz system: ode3solv.m, ode3.m).
d] Estimate the acf, the periodogram and the power spectrum of simulated time series
(Earth orbit data; the time unit is kiloyears, insolation for lat=44N, prc=precession angle
of the Earth's axis, obl=obliquity (tilt) of the Earth's axis, ecc=eccentricity of the Earth's orbit;
x=load('orbital'), plot(x.inso(:,1),x.inso(:,2))) and of measured time series
(δ18O ice core data).
Note that some of the data are not evenly sampled!
Please label the axes with the correct units. Calculate the Nyquist frequency in all cases.
Date due: May 24
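To make the relation between sampling interval, frequency axis and Nyquist frequency concrete, here is a small Python sketch (standard library only) that computes a periodogram with a direct DFT. The function names are hypothetical, and the normalisation |X_k|^2 * dt / n is one common convention that need not match the scaling of Matlab's periodogram; the sketch also assumes evenly sampled data.

```python
import cmath
import math


def nyquist_frequency(dt):
    """Highest resolvable frequency for sampling interval dt (units of 1/dt)."""
    return 1.0 / (2.0 * dt)


def periodogram(x, dt):
    """Periodogram of an evenly sampled series via a direct O(n^2) DFT.

    Returns the frequencies 0 ... Nyquist and the values |X_k|^2 * dt / n
    (assumed normalisation). The mean is removed before transforming.
    """
    n = len(x)
    mean = sum(x) / n
    y = [v - mean for v in x]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        s = sum(y[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k / (n * dt))
        power.append(abs(s) ** 2 * dt / n)
    return freqs, power


if __name__ == "__main__":
    dt = 1.0  # e.g. 1 kiloyear for the orbital data
    x = [math.sin(2 * math.pi * t * dt / 16.0) for t in range(128)]
    freqs, power = periodogram(x, dt)
    k_max = max(range(len(power)), key=power.__getitem__)
    print("peak at f =", freqs[k_max], "Nyquist frequency =", nyquist_frequency(dt))
```

With dt given in kiloyears the frequency axis is in cycles per kiloyear, which is the kind of unit labelling the exercise asks for.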
- AR processes 12 points
a] Plot a sample of each of two typical AR[1] processes: a_1=0.8 and a_1=-0.6, noise variance=1.
Estimate the acf and the power spectrum (smoothed periodogram and Yule-Walker).
b] Recover the parameters of the AR[1] processes using the samples of exercise 4a]:
[est_a var_eps]=arcov(y,1).
c] Plot a sample of an AR[2] process (a_1=1.6, a_2=-0.99, noise variance=1).
Use the Matlab commands to generate AR process samples: r=randn(n,1); b=1;
a=[1 -1.6 0.99]; y=filter(b,a,r). Note the Matlab sign convention!
Estimate the acf and the power spectrum.
Compare the theoretical frequency far=1/(2*pi)*acos(-a(2)/2/sqrt(a(3))) with the estimated one.
Estimate the order of the AR[2] process from the given sample.
Plot the final prediction error vs. the model order.
Date due: June 7
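The sign convention and the theoretical frequency can be checked with a short Python sketch (standard library only). It writes the process as x_t = a_1 x_{t-1} + a_2 x_{t-2} + eps_t, so the Matlab filter vector is a=[1 -a_1 -a_2]. The function names are hypothetical, and the simple least-squares estimator for a_1 is a stand-in for illustration, not Matlab's arcov.

```python
import math
import random


def simulate_ar(coeffs, n, noise_std=1.0, seed=0, burn_in=500):
    """Simulate x_t = sum_k coeffs[k-1] * x_{t-k} + eps_t with Gaussian eps_t."""
    rng = random.Random(seed)
    x = [0.0] * len(coeffs)  # zero initial conditions, removed by the burn-in
    for _ in range(n + burn_in):
        past = sum(c * x[-k - 1] for k, c in enumerate(coeffs))
        x.append(past + rng.gauss(0.0, noise_std))
    return x[-n:]


def ar2_peak_frequency(a1, a2):
    """Frequency of the complex pole pair of an AR[2] process (requires a2 < 0).

    Equivalent to 1/(2*pi)*acos(-a(2)/2/sqrt(a(3))) with a = [1, -a1, -a2].
    """
    return math.acos(a1 / (2.0 * math.sqrt(-a2))) / (2.0 * math.pi)


def estimate_ar1(x):
    """Least-squares estimate of a_1 for an AR[1] model (illustrative stand-in)."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(v * v for v in x[:-1])
    return num / den


if __name__ == "__main__":
    print("theoretical AR[2] frequency:", ar2_peak_frequency(1.6, -0.99))
    for a1 in (0.8, -0.6):
        y = simulate_ar([a1], 5000)
        print("true a_1 =", a1, "estimated a_1 =", estimate_ar1(y))
```

For a_1=1.6 and a_2=-0.99 the theoretical frequency is close to 0.1 cycles per sample, which is where the spectral peak of the simulated series should appear.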
- Cluster analysis 12 points
The file trilobites.m contains morphological measurements of fossil trilobites.
The rows contain species, the columns contain measurements.
Moreover, the names of the trilobite species are provided.
a] Calculate the distance matrix (using the Euclidean distance)
and label the axes. Comment on the result. Calculate further distance
matrices using other distance functions. You can find a selection
of such distance functions by calling "help pdist".
(Y = pdist(data); Y = pdist(data,'correlation');
imagesc(squareform(Y))
xlabel('Taxa'), ylabel('Taxa')
colormap(esa), colorbar
set(gca,'xtick',[1:length(labels2)],'xtickl',labels2,'ytickl',labels2)
title('Euclidean distance between pairs of taxa','fontw','b'))
b] Perform an appropriate cluster analysis and interpret the
results. Would it be necessary to normalise the data beforehand?
Plot the results in a dendrogram and label the
x-axis with the abbreviations of the species. Comment on the result.
(link the clusters Z = linkage(Y); visualise the clusters
dendrogram(Z);
a = str2num(get(gca,'xtickl'));
set(gca,'xtickl',labels2(a))
box on
title('Cluster analysis Trilobiten taxa','fontw','b')
xlabel('Taxa'), ylabel('Distance') )
c] Check the cluster analysis by computing the cophenetic
correlation coefficient. (cophenet(Z,Y))
Additional information:
Cluster analysis using the Potts model.
Date due: June 14
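The question in b] about normalising the data can be explored with a small Python sketch (standard library only). It computes Euclidean distance matrices before and after standardising each measurement column to zero mean and unit variance. The toy data are made up for illustration and are not the trilobite measurements; the function names are hypothetical, and z-scoring is only one possible normalisation.

```python
import math


def standardise(data):
    """Scale every column to zero mean and unit sample standard deviation.

    Assumes no column is constant (otherwise the division fails).
    """
    cols = list(zip(*data))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in data]


def euclidean_distance_matrix(data):
    """Full symmetric matrix of pairwise Euclidean distances between rows."""
    return [[math.dist(a, b) for b in data] for a in data]


if __name__ == "__main__":
    # Hypothetical measurements: rows are taxa, columns are two measurements
    # on very different scales.
    toy = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]]
    print(euclidean_distance_matrix(toy))
    print(euclidean_distance_matrix(standardise(toy)))
```

Without standardisation the second column dominates all distances; after standardisation both measurements contribute comparably, which can change the resulting clusters.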
- PCA, EOF & ICA 12 points
The file climate.dat contains five measurements of some climate proxies.
a] Find the first principal components using the command princomp
and plot the results! What can be stated about the variances of the
PCs found? Hint: [pcs, sPCA, v] = princomp(x);
plot(sPCA(:,1))
b] Apply a power spectral analysis to the first three PCs. Which
frequencies can be found in these PCs?
Hint: pwelch(sPCA(:,1),[],[],[],1000)
c] Perform a FastICA on the climate proxies and plot the resulting
ICs! Comment on the results (compare them with the results of the PCA)!
Hint: [ic w a] = fastica(x);
plot(ic(:,1))
d] Apply a power spectral analysis to the ICs found. The sampling interval is 1 kyr. Which frequencies
are dominant? Where could such frequencies come from?
Hint: [P,f]=pwelch(ic(:,1),[],[],[],1000); plot(f,log(P))
Date due: June 21
- Recurrence Quantification Analysis 12 points
The file "eeg.mat" contains EEG measurements of an Oddball experiment
(cf. lecture).
The variable "data" contains the data recorded during the
presented stimulus, the variable "control" contains the EEG data
without a stimulus, and the variable "time" contains the corresponding
time scale in milliseconds.
a] Estimate the embedding parameters (dimension and delay)
using mutual information and the false nearest neighbours method.
b] Apply a windowed RQA. Choose an appropriate window length and
window step. Compare the results for the control and stimulus data
for some single trials and for the mean over the trials.
Hints!
Date due: June 28
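The basic ingredients of a windowed RQA can be sketched in Python (standard library only) as follows: time-delay embedding, a thresholded recurrence matrix, and the recurrence rate per window. This is only a minimal illustration with hypothetical function names; it uses the Euclidean norm and a fixed threshold eps, counts the main diagonal in the recurrence rate, and covers none of the further RQA measures.

```python
import math


def delay_embed(x, dim, delay):
    """Time-delay embedding: vectors (x_i, x_{i+delay}, ..., x_{i+(dim-1)*delay})."""
    n = len(x) - (dim - 1) * delay
    return [[x[i + k * delay] for k in range(dim)] for i in range(n)]


def recurrence_matrix(vectors, eps):
    """R[i][j] = 1 if the Euclidean distance of states i and j is <= eps."""
    return [[1 if math.dist(a, b) <= eps else 0 for b in vectors] for a in vectors]


def recurrence_rate(r):
    """Fraction of recurrence points (main diagonal included)."""
    n = len(r)
    return sum(map(sum, r)) / (n * n)


def windowed_recurrence_rate(x, dim, delay, eps, window, step):
    """Recurrence rate in sliding windows of the given length and step."""
    rates = []
    for start in range(0, len(x) - window + 1, step):
        vectors = delay_embed(x[start:start + window], dim, delay)
        rates.append(recurrence_rate(recurrence_matrix(vectors, eps)))
    return rates


if __name__ == "__main__":
    # Synthetic test signal whose oscillation frequency increases with time.
    x = [math.sin(0.2 * t + 0.001 * t * t) for t in range(400)]
    print(windowed_recurrence_rate(x, dim=3, delay=5, eps=0.3, window=100, step=50))
```

For the EEG data, dim and delay would come from part a], and the same window length and step should be used for the control and stimulus trials so that the results are comparable.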
- Extreme Value Statistics I 12 points
Topics.
Lecture.
Statistics using R.
Assignments.
Data.
R script for assignment #1.
Useful R commands: R, q(), pdf("output.pdf"), dev.off()
Date due: July 5
- Extreme Value Statistics II 12 points
Assignments.
Date due: July 12
- Complex Networks 12 points
Lecture *
Material 1 *
Exercise 1.
Material 2 *
Exercise 2.
Date due: July 19
- Test (Klausur): July 20, 2007 (20.7.07) from 11:00 to 12:30 in room 4.22 (air-conditioned computer lab).
Matlab
Starting command: matlab &
mkdir matlab; startup.m: addpath '/usr/HOME/Unterverzeichnis'
clear, clf
% all text right of % is just a comment
Summe.m
% function declaration
function z=Summe(x,y);
z=x+y
x=[0 1 5 0 3], log(x), find(x), x~=0, ~x, ~~x
Setting the path for the TISEAN binaries in the Bash shell:
add the following commands to the file .bashrc:
PATH=$PATH:/data/myscripts
export PATH
Gnuplot (start with: gnuplot):
plot [1:12] sin (x) with line 5, exp(x)/2000
plot "< delay -d 1 henon.dat" w d
p [t=0:1] 3.5*t*(1-t), t
p 'name1' w l (i lp d st fst hist boxes) lt 2 lw 3, 'name2'
Some gnuplot commands for formatting the plot:
set size square
set xlabel 'X'; set ylabel 'X'
set autoscale xy
set yrange [0.5:2.5]
set xrange [-20.:20]
set size
set xlabel ""
set ylabel ""
unset logscale
set nologscale y
set nologscale x
set xrange [-180:180]
set yrange [-90:90]
unset key
unset xtics; unset ytics; unset title; unset border; set bmargin 0; set lmargin 0; set rmargin 0; set tmargin 0
plot '-' w l lt 1, 'uu0.dat' w d lt 3
-60 0
-30 0
-30 30
-60 30
-60 0
Writing a PostScript file from gnuplot:
set term postscript landscape 'Helvetica' 14
set output 'name.ps'
splot "<./delay -m3 -d10 name.dat" w l
set term x11
ode3solv.m,
ode3.m.
Tips for using TISEAN from gnuplot:
Covariance function:
plot "<./corr -D200 AR2.dat" w lp
Power spectra:
plot[0:0.1]"<./mem_spec -p40 -f400 AR2.dat"w lp
plot[0:0.1]"<./spectrum AR2.dat"w lp
Phase space representation:
plot "<./delay -m3 -d10 name.dat" w l or, in three dimensions, splot "<./delay -m3 -d10 name.dat" w l
Recurrence plot:
plot "name.dat" w linesp
plot "<./recurr -d10 -r0.1 -%50 name.dat"
./recurr -d2 -r2 -%50 data.dat -o, then display in gnuplot with plot "data.dat" w linesp and plot "data.dat.rec"
Mutual Information:
plot"<./mutual -b50 -D30 AR2.dat"w lp
Space-Time-Separation:
plot"<./stp -d2 -m2 AR2.dat"w lp
Determining the embedding dimension:
plot "<./false_nearest -m2 -M7 -d6 amplitude.dat" w l
Correlation dimension:
d2 name.dat -d8 -t100 -o
plot 'name.dat.c2',.01*x**2.13
Lyapunov exponent:
lyap_k amplitude.dat -M6 -m3 -d8 -t100 -s500 -r.1 -o
plot[1:200]"amplitude.dat.lyap"w lp,-4.7+0.013*x
Lab mice:
henon -l10000 -o
Sunspots, Torus, x Lorenz, Noise, ElNino, AR 1, AR 2, Logistic map, SIN+Offset+Noise, harmonic process, Lorenz, Lorenz, Lorenz x, Lorenz y, Lorenz z
Evaluation:
9 ECTS points in total
Necessary condition for taking part in the test: more than 50 % of the points in the exercises
Sufficient condition for obtaining the ECTS points: more than 50 % of the points in the test
Literature:
J. Honerkamp, Stochastische Dynamische Systeme, VCH Weinheim 1990
R. Schlittgen & B.H. Streitberg, Zeitreihenanalyse, R. Oldenbourg, München 1997
Marek Fisz, Wahrscheinlichkeitsrechnung und mathematische Statistik
H. Kantz & T. Schreiber, Nonlinear Time Series Analysis, Cambridge University Press 1997 **
TISEAN 2.1:
http://www.mpipks-dresden.mpg.de/~tisean/TISEAN_2.1/index.html
H. Rinne, Taschenbuch der Statistik, Harry Deutsch 2003
W.H. Press, S.A. Teukolsky, W.T. Vetterling & B.P. Flannery:
Numerical Recipes in C, Cambridge University Press 1993
Script
Matlab - Signal Processing Toolbox