URL: http://www.agnld.uni-potsdam.de/~shw/Lehre/lehrangebot/2007SS-DAM/2007SS-DAM.html

Summer semester 2007

Test (Klausur): July 20, 2007 (20.7.07) from 11:00 to 12:30 in room 4.22 (air-conditioned computer lab).

Sufficient condition to get the ECTS points: > 50 % of the test points

Lecture (V): J. Kurths, N. Marwan, H. Rust, N. Wessel, G. Zamora Lopez; Fri 11:00-12:30, 1.19.4.15
Exercises: N. Marwan, G. Zamora Lopez, H. Rust & U. Schwarz (shw AT agnld.uni-potsdam.de)
Tue 13:15-14:45, Haus 19, Room 423 (computer pool)
Thu 15:15-16:45, Haus 19, Room 423 (computer pool) or 316

46. Nonlinear data analysis and modeling in sciences

This lecture is one of the compulsory elective lectures ("wahlobligatorische Vorlesungen") for the elective subject "Wahlpflichtfach 1" in nonlinear dynamics ("Nichtlineare Dynamik").

Using several methods of data and time series analysis (moments, histograms, statistical tests, correlation functions, EOFs, spectra, wavelets, phase space reconstruction, recurrence plots, symbolic dynamics), we will consider data from models (maps, ODEs, ARMA processes, Langevin equations, stochastic differential equations, networks) as well as measured data. As the focus of the lectures is on the analysis techniques, we choose examples from various fields, stressing applications that are not part of "classical physics":

1. Heart dynamics. Variability of cardiovascular signals - symbolic dynamics, AR models
2. Evolution of bone structure - image processing, symbolic dynamics
3. Palaeoclimate - spectra, recurrence, synchronization
4. Natural hazards, extreme events (floods, storms, earthquakes) - statistics using R
5. Cognition & brain networks
6. Complex networks (neural network, cell metabolism)
7. Visualization of general circulation model data

This course complements the lectures on nonlinear dynamics. It is intended to bridge the gap between classical physics education and problems from neighbouring disciplines, preparing the participants for interdisciplinary research. Knowledge of a programming language is helpful but not required.

Course assessment: solved exercises and a test in the computer lab


Exercises:

Please send the program code you used to SHW at AGNLD.Uni-Potsdam.de. The code should include a discussion of your results.
  1. Stochastic processes and samples. Estimating moments, probability density functions and entropies of random variables
    a] Plot the position x(t) of one component of the harmonic oscillator [t=(1:N)*dt; plot(t,sin(2*pi*t/Period),'d-')], corrupted by additive Gaussian (observational) noise, for three signal-to-noise ratios (SNR1>>1, SNR2=1, SNR3<<1). (1 point)
    b] Plot the three corresponding estimates of the probability density p(x). (1 point)
    c] Estimate the first two moments. (1 point)
    d] Estimate the entropies [-sum(log(p(find(p))).*p(find(p)))] of appropriately binned samples. (1 point)
    e] Estimate the entropies of texts. x=textread('A.txt','%c'); x=double(x); double('A'), char(228), mx=x(x<123). (String handling) (1 point)
    f] Hand in the commented Matlab scripts or functions you used. (1 point)
    Date due: April 26
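
The estimates asked for in parts a] to d] can be sketched in a few lines. The course works in Matlab; what follows is only a plain-Python sketch of the same computation, under assumptions that are not part of the exercise: the SNR is taken as the ratio of signal variance to noise variance, and the function names, the bin count and all parameter values are illustrative.

```python
import math
import random


def noisy_sine(n, dt, period, snr, rng):
    """Unit-amplitude sine plus additive Gaussian noise.

    Assumption: SNR = signal variance / noise variance.
    """
    signal_var = 0.5  # variance of a unit-amplitude sine over full periods
    sigma = math.sqrt(signal_var / snr)
    return [math.sin(2 * math.pi * (k + 1) * dt / period) + rng.gauss(0.0, sigma)
            for k in range(n)]


def moments(x):
    """Sample mean and (unbiased) sample variance."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    return mean, var


def histogram_probs(x, bins):
    """Relative frequencies of x in `bins` equal-width bins."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0  # guard against a constant sample
    counts = [0] * bins
    for v in x:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return [c / len(x) for c in counts]


def shannon_entropy(p):
    """-sum(p*log(p)) over the non-empty bins, in nats."""
    return -sum(q * math.log(q) for q in p if q > 0)


rng = random.Random(1)  # fixed seed, so that runs are repeatable
for snr in (100.0, 1.0, 0.01):
    x = noisy_sine(n=5000, dt=0.01, period=1.0, snr=snr, rng=rng)
    mean, var = moments(x)
    h = shannon_entropy(histogram_probs(x, bins=30))
    print(f"SNR={snr:g}: mean={mean:.3f} var={var:.3f} entropy={h:.3f} nats")
```

The entropy of a binned sample depends on the bin width, so values are only comparable between samples that use the same binning.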

  2. Symbolic dynamics: Shannon entropy, Renyi entropy, mutual information, algorithmic complexity
    a] Calculate the Shannon entropy, the Renyi entropy (q=0.2 and q=2), the mutual information (gnuplot and TISEAN: plot"<./mutual -b50 -D30 Random.dat"w lp) and the algorithmic complexity of a random telegraph sample [floor(rand(N,1)+0.5)], a periodic signal [x=floor((sin(2*pi/2.1*t)+1));] and trajectories of the logistic map for different control parameters, as a function of the sample size N.
    Useful Matlab script: symbolics.m. (12 points)
    Date due: May 10
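
The entropy part of this exercise can be sketched as follows. This is a plain-Python illustration (the course itself uses Matlab and the script symbolics.m, which is not reproduced here). The binary partition at x = 0.5, the word length 3 and all function names are assumptions made for the example; mutual information and algorithmic complexity are not covered.

```python
import math
import random
from collections import Counter


def telegraph(n, rng):
    """Random telegraph sample: independent symbols 0/1, as floor(rand + 0.5)."""
    return [int(rng.random() + 0.5) for _ in range(n)]


def logistic_symbols(r, n, x0=0.3, transient=1000):
    """Binary symbols (1 if x >= 0.5, else 0) of the logistic map x -> r*x*(1-x)."""
    x = x0
    for _ in range(transient):  # discard the transient
        x = r * x * (1.0 - x)
    symbols = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        symbols.append(1 if x >= 0.5 else 0)
    return symbols


def word_probs(symbols, length):
    """Relative frequencies of overlapping words of the given length."""
    words = Counter(tuple(symbols[i:i + length])
                    for i in range(len(symbols) - length + 1))
    total = sum(words.values())
    return [c / total for c in words.values()]


def renyi_entropy(p, q):
    """Renyi entropy of order q in nats; q = 1 gives the Shannon entropy."""
    if abs(q - 1.0) < 1e-12:
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return math.log(sum(pi ** q for pi in p if pi > 0)) / (1.0 - q)


rng = random.Random(0)
for n in (100, 1000, 10000):  # dependence on the sample size N
    samples = {
        "telegraph": telegraph(n, rng),
        "logistic r=3.2": logistic_symbols(3.2, n),
        "logistic r=4.0": logistic_symbols(4.0, n),
    }
    for name, s in samples.items():
        p = word_probs(s, length=3)
        values = ", ".join(f"H_{q:g}={renyi_entropy(p, q):.3f}" for q in (0.2, 1.0, 2.0))
        print(f"N={n:5d} {name:15s} {values}")
```

For any distribution the Renyi entropy does not increase with q, which gives a quick consistency check on the three values printed per sample.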

  3. DFT, periodogram, power spectra
    a] Compute the discrete Fourier transforms and the periodograms of 4 samples of a Gaussian noise process and a harmonic process.
    b] Estimate the power spectra of both processes.
    c] Estimate the acf, the periodogram ([Pxx,f]=periodogram(y); [Pxx,f]=pmcov(y,p); [Pxx,f]=pyulear(y,p);) and the power spectrum (Welch, Yule-Walker) of time series of different typical processes (Gaussian noise, harmonic process) and systems (x component of the Rössler system, z component of the Lorenz system: ode3solv.m, ode3.m).
    d] Estimate the acf, the periodogram and the power spectrum of simulated time series (Earth orbit data: x=load('orbital'), plot(x.inso(:,1),x.inso(:,2)); the unit is kiloyears, insolation for lat=44N, prc=precession angle of the Earth's axis, obl=obliquity of the Earth's axis, ecc=eccentricity of the Earth's orbit) and of measured time series (δ18O ice core data).
    Note that some of the data are not evenly sampled!
    Please label the axes with the correct units. Calculate the Nyquist frequency in all cases.
    12 points
    Date due: May 24
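
The relation between sampling step, frequency axis and Nyquist frequency can be made concrete with a short sketch. It is written in plain Python rather than Matlab, uses a direct O(N^2) DFT instead of an FFT, and assumes evenly sampled data. The normalisation dt/N * |X_k|^2 is one of several conventions, and the example signal (period 8 kyr, step 0.5 kyr) is hypothetical.

```python
import cmath
import math


def dft(x):
    """Discrete Fourier transform by direct summation (O(N^2), fine for short samples)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def periodogram(x, dt):
    """Periodogram at the non-negative frequencies k/(N*dt), k = 0..N/2.

    Normalisation (one of several conventions): P_k = dt/N * |X_k|^2.
    """
    n = len(x)
    mean = sum(x) / n
    spectrum = dft([v - mean for v in x])
    freqs = [k / (n * dt) for k in range(n // 2 + 1)]
    power = [dt / n * abs(spectrum[k]) ** 2 for k in range(n // 2 + 1)]
    return freqs, power


def nyquist_frequency(dt):
    """Highest resolvable frequency of an evenly sampled series with step dt."""
    return 1.0 / (2.0 * dt)


# Hypothetical example: harmonic process with period 8 kyr, sampled every 0.5 kyr.
dt, period, n = 0.5, 8.0, 128
x = [math.sin(2 * math.pi * t * dt / period) for t in range(n)]
freqs, power = periodogram(x, dt)
peak = freqs[power.index(max(power))]
print(f"Nyquist frequency: {nyquist_frequency(dt)} 1/kyr, spectral peak at {peak} 1/kyr")
```

If dt is given in kyr, the frequency axis is in 1/kyr. Unevenly sampled series have to be interpolated onto an even grid (or analysed with a method made for uneven sampling) before this sketch applies.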

  4. AR processes 12 points
    a] Plot a sample of two typical AR[1] processes: a_1=0.8 and a_1=-0.6, noise variance=1. Estimate the acf and the power spectrum (smoothed periodogram and Yule-Walker).
    b] Recover the parameters of AR[1] processes using the samples of exercise 4a]. [est_a var_eps]=arcov(y,1).
    c] Plot a sample of an AR[2] process (a_1=1.6, a_2=-0.99, noise variance=1). Use the Matlab command filter to generate AR process samples: x=randn(n,1); b=1; a=[1 -1.6 0.99]; y=filter(b,a,x). Note the Matlab sign convention!
    Estimate the acf and the power spectrum. Compare the theoretical frequency far=1/(2*pi)* acos(-a(2)/2/ sqrt(a(3))) with the estimated one.
    Estimate the order of the AR[2] process from the given sample. Plot the final prediction error vs. the model order.
    Date due: June 7
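
Part c] can be followed step by step in a short sketch, given here in plain Python rather than Matlab. It writes the process as x_t = phi1*x_(t-1) + phi2*x_(t-2) + eps_t, so phi1 = 1.6 and phi2 = -0.99 correspond to the Matlab vector a = [1 -1.6 0.99]. The function names, the seed and the sample length are assumptions made for the example, and the parameters are recovered with the Yule-Walker equations instead of arcov.

```python
import math
import random


def simulate_ar(phi, n, rng, burn_in=1000):
    """Sample of x_t = phi[0]*x_{t-1} + phi[1]*x_{t-2} + ... + eps_t, eps_t ~ N(0, 1)."""
    p = len(phi)
    x = [0.0] * p
    for _ in range(n + burn_in):
        past = x[-1:-p - 1:-1]  # x_{t-1}, x_{t-2}, ...
        x.append(sum(a * v for a, v in zip(phi, past)) + rng.gauss(0.0, 1.0))
    return x[-n:]  # drop the start-up values and the burn-in


def autocorrelation(x, max_lag):
    """Sample autocorrelation r_0..r_max_lag (biased estimator, mean removed)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d) / n
    return [sum(d[t] * d[t + k] for t in range(n - k)) / n / c0
            for k in range(max_lag + 1)]


def yule_walker_ar2(r1, r2):
    """Solve the Yule-Walker equations of an AR[2] process for (phi1, phi2)."""
    phi1 = r1 * (1.0 - r2) / (1.0 - r1 ** 2)
    phi2 = (r2 - r1 ** 2) / (1.0 - r1 ** 2)
    return phi1, phi2


def ar2_frequency(phi1, phi2):
    """Angle of the complex pole pair divided by 2*pi (needs phi2 < 0 and complex poles)."""
    return math.acos(phi1 / (2.0 * math.sqrt(-phi2))) / (2.0 * math.pi)


rng = random.Random(2007)
y = simulate_ar((1.6, -0.99), n=20000, rng=rng)
r = autocorrelation(y, 2)
phi1_hat, phi2_hat = yule_walker_ar2(r[1], r[2])
print(f"estimated phi1={phi1_hat:.3f}, phi2={phi2_hat:.3f}")
print(f"theoretical frequency: {ar2_frequency(1.6, -0.99):.4f} cycles per sample")
print(f"frequency from estimates: {ar2_frequency(phi1_hat, phi2_hat):.4f} cycles per sample")
```

ar2_frequency(phi1, phi2) is the same expression as far=1/(2*pi)*acos(-a(2)/2/sqrt(a(3))) in the exercise, since phi1 = -a(2) and phi2 = -a(3).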

  5. Cluster analysis 12 points
    File trilobites.m contains morphological measurements of fossil trilobites. The rows contain species, the columns contain measurements. Moreover, the names of the trilobite species are provided.
    a] Calculate the distance matrix (using the Euclidean distance) and label the axes. Comment on the result. Calculate further distance matrices using other distance functions. You can find a selection of such distance functions by calling "help pdist". (Y = pdist(data); Y = pdist(data,'correlation'); imagesc(squareform(Y)); xlabel('Taxa'), ylabel('Taxa'); colormap(esa), colorbar; set(gca,'xtick',[1:length(labels2)],'xtickl',labels2,'ytickl',labels2); title('Euclidean distance between pairs of taxa','fontw','b'))
    b] Perform an appropriate cluster analysis and interpret the results. Would it be necessary to renormalise the data beforehand? Plot the results in a dendrogram and label the x-axis with the abbreviations of the species. Comment on the result. (link the clusters: Z = linkage(Y); visualise the clusters: dendrogram(Z); a = str2num(get(gca,'xtickl')); set(gca,'xtickl',labels2(a)); box on; title('Cluster analysis Trilobiten taxa','fontw','b'); xlabel('Taxa'), ylabel('Distance'))
    c] Check the cluster analysis by computing the cophenetic correlation coefficient. (cophenet(Z,Y))
    Additional information: Cluster analysis using the Potts model.
    Date due: June 14
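
What pdist and linkage compute can be illustrated with a small plain-Python sketch (the course itself uses Matlab). It builds the Euclidean distance matrix and merges clusters by single linkage, the default method of Matlab's linkage. The five two-column rows are hypothetical toy values, not the trilobite measurements, and the function names are made up for the example.

```python
import math


def euclidean_distance_matrix(data):
    """Pairwise Euclidean distances between the rows of `data`."""
    return [[math.dist(a, b) for b in data] for a in data]


def single_linkage(dist):
    """Agglomerative clustering with single (nearest-neighbour) linkage.

    Returns the merges as (members_a, members_b, distance), in merge order.
    """
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: smallest distance between any two members
                d = min(dist[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical toy data: rows are taxa, columns are measurements.
labels = ["A", "B", "C", "D", "E"]
data = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.9), (9.0, 1.0)]

dist = euclidean_distance_matrix(data)
for left, right, d in single_linkage(dist):
    names_left = "+".join(labels[i] for i in left)
    names_right = "+".join(labels[i] for i in right)
    print(f"merge {names_left} with {names_right} at distance {d:.3f}")
```

The merge distances are the heights at which the branches of a dendrogram join. If the columns have very different scales, the columns with the largest numbers dominate the Euclidean distance, which is why the question of renormalising the data matters.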

  6. PCA, EOF & ICA 12 points
    The file climate.dat contains five measurements of some climate proxies.
    a] Find the first principal components using the command princomp and plot the results! What can be stated about the variances of the PCs found? Hint: [pcs, sPCA, v] = princomp(x); plot(sPCA(:,1))
    b] Apply a power spectral analysis to the first three PCs. Which frequencies can be found in these PCs? Hint: pwelch(sPCA(:,1),[],[],[],1000)
    c] Perform a FastICA on the climate proxies and plot the resulting ICs! Comment on the results (compare them with the results of the PCA)! Hint: [ic w a] = fastica(x); plot(ic(:,1))
    d] Apply a power spectral analysis to the ICs found. The sampling time is 1 kyr. Which frequencies are dominant? Where could such frequencies come from? Hint: [P,f]=pwelch(ic(:,1),[],[],[],1000);plot(f,log(P))
    Date due: June 21
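
The idea behind part a] can be sketched without a toolbox. The plain-Python example below is not what princomp does internally: it finds only the leading principal component, by power iteration on the covariance matrix. The two-column toy data are hypothetical and stand in for climate.dat, and the function names are made up for the example.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of the columns of `rows`."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(m)] for i in range(m)]


def leading_component(cov, iterations=200):
    """Largest eigenvalue and unit eigenvector of `cov` by power iteration."""
    m = len(cov)
    v = [1.0 / math.sqrt(m)] * m  # assumed not orthogonal to the leading direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(m)) for i in range(m))
    return eigenvalue, v


def scores(rows, direction):
    """Projection of the centred data onto `direction` (the first PC time series)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return [sum((r[j] - means[j]) * direction[j] for j in range(m)) for r in rows]


# Hypothetical toy data: the second column is roughly twice the first.
rows = [(0.0, 0.1), (1.0, 1.9), (2.0, 4.2), (3.0, 5.8), (4.0, 8.1)]
cov = covariance_matrix(rows)
eigenvalue, direction = leading_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
print(f"first PC direction: {[round(c, 3) for c in direction]}")
print(f"explained variance fraction: {eigenvalue / total_variance:.4f}")
print(f"first PC scores: {[round(s, 3) for s in scores(rows, direction)]}")
```

The eigenvalue divided by the trace of the covariance matrix is the fraction of the total variance carried by the first PC, which is the quantity the question about the variances of the PCs refers to.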

  7. Recurrence Quantification Analysis 12 points
    The file "eeg.mat" contains EEG measurements of an Oddball experiment (cf. lecture). The variable "data" contains the EEG data recorded during the presented stimulus, the variable "control" contains the EEG data without a stimulus, and the variable "time" contains the corresponding time scale in milliseconds.
    a] Estimate the embedding parameters (dimension and delay) using mutual information and the false nearest neighbours method.
    b] Apply a windowed RQA. Choose an appropriate window length and window step. Compare the results for the control and stimulus data, for some single trials and for the mean over the trials.
    Hints!
    Date due: June 28
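
The mechanics of a windowed RQA can be sketched in plain Python (the course itself uses Matlab). The sketch covers only time-delay embedding, the recurrence matrix with the maximum norm, and the recurrence rate; other RQA measures such as determinism or laminarity are left out. The sine signal, the threshold and the window settings are hypothetical and stand in for the EEG data.

```python
import math


def delay_embed(x, dim, delay):
    """Delay vectors (x_t, x_{t+delay}, ..., x_{t+(dim-1)*delay})."""
    n_vectors = len(x) - (dim - 1) * delay
    return [tuple(x[t + k * delay] for k in range(dim)) for t in range(n_vectors)]


def recurrence_matrix(vectors, eps):
    """R[i][j] = 1 if the maximum-norm distance of vectors i and j is <= eps."""
    return [[1 if max(abs(a - b) for a, b in zip(u, v)) <= eps else 0
             for v in vectors] for u in vectors]


def recurrence_rate(matrix):
    """Fraction of recurrence points, excluding the main diagonal."""
    n = len(matrix)
    return (sum(map(sum, matrix)) - n) / (n * (n - 1))


def windowed_recurrence_rate(x, dim, delay, eps, window, step):
    """Recurrence rate in sliding windows of `window` samples, moved by `step`."""
    rates = []
    for start in range(0, len(x) - window + 1, step):
        vectors = delay_embed(x[start:start + window], dim, delay)
        rates.append(recurrence_rate(recurrence_matrix(vectors, eps)))
    return rates


# Hypothetical signal: a sine with a period of 20 samples.
x = [math.sin(2 * math.pi * t / 20) for t in range(200)]
rates = windowed_recurrence_rate(x, dim=2, delay=5, eps=0.1, window=100, step=50)
print("recurrence rate per window:", [round(r, 3) for r in rates])
```

For the exercise, the same windows would be moved over single trials and over the trial mean of the control and stimulus data, with dimension and delay taken from part a].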

  8. Extreme Value Statistics I 12 points
    Topics. Lecture. Statistics using R. Assignments. Data. R script for assignment #1.
    R,q(),pdf("output.pdf"),dev.off()
    Date due: July 5

  9. Extreme Value Statistics II 12 points
    Assignments.
    Date due: July 12

  10. Complex Networks 12 points
    Lecture * Material 1 * Exercise 1.
    Material 2 * Exercise 2.
    Date due: July 19

  11. Test (Klausur): July 20, 2007 (20.7.07) from 11:00 to 12:30 in room 4.22 (air-conditioned computer lab).

Tools: matlab (basics), TISEAN, gnuplot

Matlab

Starting command: matlab &
mkdir matlab
startup.m: addpath '/usr/HOME/Unterverzeichnis'
clear, clf
% all text right of % is just a comment
Summe.m
% function declaration
function z=Summe(x,y);
z=x+y

x=[0 1 5 0 3], log(x), find(x), x=~0, ~x, ~~x
Setting the path to the TISEAN binaries in the Bash shell:
Add the following commands to the file .bashrc:

PATH=$PATH:/data/myscripts
export PATH

Gnuplot (start with: gnuplot):
plot  [1:12] sin (x) with line 5, exp(x)/2000
plot "< delay -d 1 henon.dat" w d
p [t=0:1] 3.5*t*(1-t), t
p 'name1' w l (i lp d st fst hist boxes) lt 2 lw 3, 'name2' 
Some gnuplot commands for formatting the plot:
set size square
set xlabel 'X'; set ylabel 'X'
set autoscale xy
set yrange [0.5:2.5]
set xrange [-20.:20]
set size
set xlabel ""
set ylabel ""
unset logscale
set nologscale y
set nologscale x
set xrange [-180:180]
set yrange [-90:90]
unset key
unset xtics; unset ytics; unset title; unset border; set bmargin 0; set lmargin 0; set rmargin 0; set tmargin 0
plot '-' w l lt 1, 'uu0.dat' w d lt 3
-60     0
-30     0
-30     30
-60     30
-60     0


Writing a PostScript file from gnuplot:

set term postscript landscape 'Helvetica' 14
set output 'name.ps' 
splot "<./delay -m3 -d10 name.dat" w l
set term x11
ode3solv.m, ode3.m.

Tips for using TISEAN from gnuplot:

Covariance function:
plot"<./corr -D200 AR2.dat"w lp

Power spectra:
plot[0:0.1]"<./mem_spec -p40 -f400 AR2.dat"w lp
plot[0:0.1]"<./spectrum  AR2.dat"w lp

Phase space representation:
plot "<./delay -m3 -d10 name.dat" w l
or, with splot:
splot "<./delay -m3 -d10 name.dat" w l

Recurrence plot:

plot "name.dat" w linesp
plot "<./recurr -d10 -r0.1 -%50 name.dat"

./recurr -d2 -r2 -%50 data.dat -o
Display in gnuplot with plot "data.dat" w linesp and plot "data.dat.rec"

Mutual Information:
plot"<./mutual -b50 -D30 AR2.dat"w lp

Space-Time-Separation:
plot"<./stp -d2 -m2 AR2.dat"w lp

Determining the embedding dimension:
plot"<./false_nearest -m2 -M7 -d6  amplitude.dat" w l

Correlation dimension:
d2 name.dat -d8 -t100 -o
plot 'name.dat.c2',.01*x**2.13

Lyapunov exponent:
lyap_k amplitude.dat -M6 -m3 -d8 -t100 -s500 -r.1 -o
plot[1:200]"amplitude.dat.lyap"w lp,-4.7+0.013*x

Lab mice:

henon -l10000 -o
Sun spots, Torus, x Lorenz, Noise, ElNino, AR 1, AR 2, Logistic map, SIN+Offset+Noise, harmonic process, Lorenz, Lorenz, Lorenz x, Lorenz y, Lorenz z,

Evaluation:

9 ECTS points in total

Necessary condition to take part in the test: > 50 % of the exercise points
Sufficient condition to get the ECTS points: > 50 % of the test points

Literature:

J. Honerkamp, Stochastische Dynamische Systeme, VCH Weinheim 1990
R. Schlittgen & B.H. Streitberg, Zeitreihenanalyse, R. Oldenbourg, München 1997
Marek Fisz, Wahrscheinlichkeitsrechnung und mathematische Statistik
H. Kantz & T. Schreiber, Nonlinear Time Series Analysis, Cambridge University Press 1997
TISEAN 2.1: http://www.mpipks-dresden.mpg.de/~tisean/TISEAN_2.1/index.html
H. Rinne, Taschenbuch der Statistik, Harry Deutsch 2003
W.H. Press, S.A. Teukolsky, W.T. Vetterling & B.P. Flannery: Numerical Recipes in C, Cambridge University Press 1993
Script
Matlab - Signal Processing Toolbox
