Data Analysis With Python
Arnaud Beck
Laboratoire Leprince-Ringuet, École Polytechnique, CNRS/IN2P3
Outline
1 Introduction
2 Using Python
3 Basic Python
4 Scipy
5 Data I/O
6 Visualization
Why come to Python?
Should I use a low-level, compiled language or an interpreted language?
Commercial or open source?

                                    C/C++   Matlab   Python
Easy and flexible                             X        X
Performance                           X
Free and available on any system      X                X
Why stick to Python?
Python is distinguished by its large and active scientific computing community. There are people developing libraries for virtually anything.
Python also acts as glue to other languages.
Getting Python for data analysis
Critical for data analysis
Modules: Scipy, Matplotlib
Application specific
Modules: mpi4py, VTK, PyTables, etc.
Interactive mode in a Python shell
Use of a script
You can compile scripts into .pyc bytecode files. This is mostly useful for developers.
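As a rough illustration of the compile step, the sketch below writes a throwaway script and compiles it with the standard-library py_compile module. The file names are hypothetical and chosen only for this example.

```python
import os
import py_compile
import tempfile

# Write a tiny throwaway script (hypothetical file name, for illustration only).
workdir = tempfile.mkdtemp()
script = os.path.join(workdir, "hello.py")
with open(script, "w") as fid:
    fid.write('print("hello from a script")\n')

# Compile it to bytecode; the return value is the path of the .pyc file.
pyc_path = py_compile.compile(script, cfile=os.path.join(workdir, "hello.pyc"))
print(pyc_path)
```

The same script could also be run directly with `python hello.py`; compiling by hand is rarely needed because Python caches bytecode for imported modules on its own.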
IPython: a convenient and comfortable Python shell
Interesting features
Command history
Any shell command accessible via !
Command auto-completion
Quick help through the use of ?
Inline and interactive graphics
Timing and profiling tools
Many more ...
Python is an object-oriented language
In Python, we do things with stuff!
things = operations
stuff = objects
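A small sketch of what "operations on objects" can look like in practice. The variable names and values are made up for illustration.

```python
# Every value is an object with a type; operations act on objects.
number = 3.5
text = "beam"
items = [1, 2, 3]

print(type(number), type(text), type(items))

# Operators are operations...
doubled = number * 2
# ...and so are methods, which are operations attached to an object.
shouted = text.upper()
items.append(4)   # the list is modified in place

print(doubled, shouted, items)
```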
Strings
Ordered collection (or sequence) of characters
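A minimal sketch of basic string operations, using a made-up string. It also shows that strings cannot be modified in place.

```python
s = "Python"
print(len(s))         # number of characters
print(s[0], s[-1])    # indexing: first and last character
print(s + " rocks")   # concatenation builds a new string
print(s * 2)          # repetition

# Strings are immutable: item assignment raises TypeError.
try:
    s[0] = "J"
except TypeError as err:
    error_message = str(err)
    print(error_message)
```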
String Methods
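The original slide content is not recoverable here, so the following is only a sketch of a few commonly used string methods, applied to a made-up line of text.

```python
line = "  energy = 12.5 MeV  "

clean = line.strip()                  # remove surrounding whitespace
key, value = clean.split("=")         # split on a separator
key = key.strip()
number = float(value.split()[0])      # take the first token and convert it

print(key.upper(), number)
print(clean.replace("MeV", "GeV"))    # returns a new string
position = clean.find("12")           # index of the first match, -1 if absent
joined = "-".join(["a", "b", "c"])    # glue a list of strings together
print(position, joined)
```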
Lists
Sequence of any objects
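A short sketch of list behavior, with made-up contents: a list can hold objects of any type and, unlike a string, can be changed in place.

```python
mixed = [1, "two", 3.0, [4, 5]]   # any objects, including other lists
mixed.append("six")               # add at the end
mixed[0] = "one"                  # lists are mutable
last = mixed.pop()                # remove and return the last item
print(len(mixed), last)

numbers = [3, 1, 2]
numbers.sort()                        # in-place sort
squares = [n ** 2 for n in numbers]   # list comprehension
print(numbers, squares)
```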
Slices
Manipulating sequences
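A minimal sketch of slice syntax, `sequence[start:stop:step]`, on a made-up list. The same syntax applies to strings and, later, to arrays.

```python
seq = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

first_three = seq[:3]       # start defaults to 0
middle = seq[3:7]           # the stop index is excluded
evens = seq[::2]            # every second item
reversed_seq = seq[::-1]    # a negative step walks backwards
tail = "filename.h5"[-3:]   # slices work on strings too

print(first_three, middle, evens, reversed_seq, tail)
```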
Importing modules
Modules define new object types and operations.
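The three usual import forms can be sketched with standard-library modules. The file name used below is made up.

```python
import math               # access names through the module prefix
from math import sqrt     # import a single name
import os.path as osp     # import under a shorter alias

hypotenuse = sqrt(3.0 ** 2 + 4.0 ** 2)
circle_area = math.pi * 2.0 ** 2
extension = osp.splitext("run_042.dat")[1]

print(hypotenuse, circle_area, extension)
```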
The large and growing community of Python users provides an increasing number of modules that already do what you need.
The Scipy module
Scipy is a collection of powerful, high-level functions for mathematics and data management. It is based on the numpy.ndarray object type and vectorized operations. The operations are optimized and coded in C to deliver high performance.
If you are using a for loop, you are probably doing something wrong!
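To make the remark about for loops concrete, here is a sketch that computes the same polynomial twice, once element by element and once as a single vectorized expression. The formula and array size are arbitrary choices for illustration, and numpy is imported directly.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 100000)

# Loop version: one Python-level operation per element (slow).
y_loop = np.empty_like(x)
for i in range(x.size):
    y_loop[i] = 3.0 * x[i] ** 2 + 1.0

# Vectorized version: the whole array is processed in compiled code.
y_vec = 3.0 * x ** 2 + 1.0

print(np.allclose(y_loop, y_vec))
```

Both versions give the same result; the vectorized one is shorter and typically much faster on large arrays.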
Creating an ndarray
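The original slide content is not recoverable here, so this is only a sketch of some common ways to create an ndarray. It calls numpy directly, and all values are made up.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # from nested lists
z = np.zeros((2, 3))                   # filled with 0.0
o = np.ones(4, dtype=np.float32)       # explicit element type
r = np.arange(0, 10, 2)                # like range(): start, stop (excluded), step
pts = np.linspace(0.0, 1.0, 5)         # 5 evenly spaced points, ends included

print(a.shape, a.ndim, o.dtype)
print(r, pts)
```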
Manipulating ndarrays
Slicing is still the basis of array manipulation.
Reshape: change the number and size of the dimensions of the array.
Sort: quite self-explanatory.
Delete, insert, append: remove or add parts of the array.
Squeeze, flatten, ravel: more ways to control the dimensionality of the array.
Transpose, swapaxes, rollaxis: more ways to arrange the dimensions as you want.
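The operations listed above can be sketched on small made-up arrays as follows, calling numpy directly.

```python
import numpy as np

a = np.arange(12)                    # 0 .. 11
m = a.reshape(3, 4)                  # 3 rows, 4 columns, same data
col = m[:, 1]                        # slicing: second column
t = m.T                              # transpose, shape (4, 3)
s = np.sort(np.array([3, 1, 2]))     # sorted copy
d = np.delete(a, [0, 1])             # copy without the first two items
ap = np.append(a, [12, 13])          # copy with two items appended
sq = np.zeros((1, 5, 1)).squeeze()   # drop the length-1 axes
flat = m.ravel()                     # back to one dimension
sw = np.swapaxes(np.zeros((2, 3, 4)), 0, 2)   # exchange first and last axes

print(m.shape, t.shape, sq.shape, sw.shape)
```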
Reading data
The whole game is to fit your data into an ndarray.
data = scipy.fromfile("file", dtype="float32", count=-1, sep=" ")
Works with raw binary files and ASCII files, but is not very flexible.
data = scipy.loadtxt("file", skiprows=0, delimiter=",")
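A self-contained sketch of both readers. fromfile and loadtxt are NumPy functions, so this example calls them through numpy directly, which should behave the same regardless of the SciPy version installed. The file names and contents are made up, and the files are created first so the example can run on its own.

```python
import os
import tempfile
import numpy as np

workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "table.csv")    # hypothetical file names
raw_path = os.path.join(workdir, "values.bin")

# Create small sample files so the example is self-contained.
with open(csv_path, "w") as f:
    f.write("# x, y\n0.0,1.0\n1.0,2.5\n2.0,4.0\n")
np.arange(6, dtype=np.float32).tofile(raw_path)  # raw binary, no header

# loadtxt: text files; lines starting with '#' are treated as comments.
table = np.loadtxt(csv_path, delimiter=",")

# fromfile: sep="" (the default) means raw binary; dtype must match the writer.
raw = np.fromfile(raw_path, dtype=np.float32)

print(table.shape, raw)
```

With fromfile, a non-empty sep (such as " ") switches to text mode, while the default empty sep reads raw binary. A raw file stores no shape or type, which is one reason the function is not very flexible.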
The file object is a basic Python type. It is created by
fid = open("filename", "r")
"r" for read, "w" for write.
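A minimal sketch of writing and then reading a text file through a file object. The file name and contents are made up; the with-block is used so the file is closed automatically.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")   # hypothetical file name

# "w" creates (or truncates) the file; the with-block closes it automatically.
with open(path, "w") as fid:
    fid.write("x y\n")
    fid.write("1 2\n3 4\n")

# "r" opens for reading; iterating over the file yields one line at a time.
rows = []
with open(path, "r") as fid:
    header = fid.readline().split()
    for line in fid:
        rows.append([int(v) for v in line.split()])

print(header, rows)
```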
Quick words about reading HDF5 files
Reading HDF5 files is module-dependent. You can use either tables or h5py, for instance.
These modules coexist well with Scipy and load data directly into ndarrays.
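As one possible illustration, the sketch below uses h5py (assumed to be installed). The file name and the dataset name "Ex" are hypothetical; a small file is written first so that there is something to read.

```python
import os
import tempfile
import numpy as np
import h5py

path = os.path.join(tempfile.mkdtemp(), "fields.h5")   # hypothetical file name

# Write a small file first so there is something to read.
with h5py.File(path, "w") as f:
    f.create_dataset("Ex", data=np.arange(6.0).reshape(2, 3))

# Datasets are addressed by name; [...] reads the whole dataset into an ndarray.
with h5py.File(path, "r") as f:
    ex = f["Ex"][...]

print(type(ex), ex.shape)
```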
Writing data
scipy.save("file", ndarray) and scipy.load("file") to store and reload arrays in the binary NumPy (.npy) format.
ndarray.tofile() to store an array in a text file or raw binary.
fileobject.write("any_string") to write a string to a text file.
The h5py and tables modules are used to write HDF5 files.
VTK script
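A sketch of the first three options. save and load are NumPy functions, so the example calls them through numpy directly. The file names are made up.

```python
import os
import tempfile
import numpy as np

workdir = tempfile.mkdtemp()
a = np.linspace(0.0, 1.0, 5)

# .npy keeps dtype and shape, so load() returns an identical array.
npy_path = os.path.join(workdir, "a.npy")
np.save(npy_path, a)
a_back = np.load(npy_path)

# tofile() writes bare values: shape and dtype are NOT stored in the file.
txt_path = os.path.join(workdir, "a.txt")
a.tofile(txt_path, sep=" ")               # non-empty sep gives text output
a_txt = np.fromfile(txt_path, sep=" ")    # dtype defaults to float

# Plain strings go through a file object.
log_path = os.path.join(workdir, "log.txt")
with open(log_path, "w") as fid:
    fid.write("saved %d values\n" % a.size)
```

The .npy route is usually the safer choice for round trips, since tofile() leaves it to the reader to know the shape and element type.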
Visualization workflow
Raw data -> [Python] -> Postprocessed data -> [Python] -> Formatted data file
Postprocessed data -> [Python, matplotlib] -> Plot
Formatted data file -> [Visualization software: Paraview, VisIt, Mayavi, ...] -> Visualization
Matplotlib: the figure object
fig = figure([options])
Operations include:
Title and axis labels
fig.xlabel("string")
Axis ticks and extent
fig.ticks(ndarray)
Display a colorbar
fig.colorbar()
Display a legend
fig.legend()
fig.savefig()
[Figure: example plot of Injected Charge, Bubble Size and Laser Amplitude, y-axis "Normalized to maximum"]
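The calls on the slide read as shorthand. In the sketch below, labels, ticks and extent are set on an Axes object (set_xlabel, set_xticks, set_xlim), which is one common way to do it in matplotlib. The curve, labels and file name are made up, and the Agg backend is selected so the example runs without a display.

```python
import os
import tempfile
import matplotlib
matplotlib.use("Agg")   # non-interactive backend; writes files only
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.0, 15.0, 50)

fig = plt.figure(figsize=(6, 4))
ax = fig.add_subplot(1, 1, 1)
ax.plot(x, np.exp(-x / 5.0), label="decay")   # made-up curve

ax.set_title("Figure object demo")            # title and axis labels
ax.set_xlabel("x")
ax.set_ylabel("Normalized to maximum")
ax.set_xticks(np.arange(0, 16, 5))            # axis ticks
ax.set_xlim(0, 15)                            # axis extent
ax.legend()                                   # display a legend

out_path = os.path.join(tempfile.mkdtemp(), "demo.png")
fig.savefig(out_path)
plt.close(fig)
```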
Matplotlib: simple plots
plot(x,y,[options])
All typical options are here: lines (style, color, width, ...), markers (size, shape, colors, ...), labels for legend, antialiasing, transparency, and many more.
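A few of these options can be sketched as keyword arguments to plot. The data and the particular styles are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 40)
fig, ax = plt.subplots()

# Line options: style, color, width, and a label for the legend.
line_a, = ax.plot(x, np.sin(x), linestyle="--", color="blue",
                  linewidth=2.0, label="sin")

# Marker options: shape, size, face color, plus transparency.
line_b, = ax.plot(x, np.cos(x), linestyle="none", marker="o",
                  markersize=4, markerfacecolor="red", alpha=0.6,
                  label="cos")

legend = ax.legend()
```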
Matplotlib 2D plots: imshow and pcolor
[Figure: example imshow and pcolor plots, each with a colorbar ranging from 0.0 to 1.0]
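A sketch that draws the same made-up array with both functions side by side. One visible difference with default settings is the orientation: imshow puts row 0 at the top, pcolor puts it at the bottom.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

data = np.random.rand(20, 20)   # made-up data with values in [0, 1)

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(9, 4))

# imshow draws the array as an image; by default row 0 is at the top.
im = ax_left.imshow(data, vmin=0.0, vmax=1.0)
fig.colorbar(im, ax=ax_left)

# pcolor draws a grid of colored cells; row 0 ends up at the bottom.
pc = ax_right.pcolor(data, vmin=0.0, vmax=1.0)
fig.colorbar(pc, ax=ax_right)
```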
2D plots with a little bit of tuning
[Figure: tuned 2D plot with a colorbar ranging from 0.0 to 3.0; axes y [m] and x - ct [m]]
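The original figure's code is not available, so the following is only a guess at the kind of tuning involved: physical axis extents, a lower-left origin, a chosen colormap, and labeled axes and colorbar. The Gaussian field and all labels are made up.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

# Made-up field: a Gaussian bump on a grid spanning -60 .. 60 in both directions.
x = np.linspace(-60.0, 60.0, 121)
y = np.linspace(-60.0, 60.0, 121)
X, Y = np.meshgrid(x, y)
field = 3.0 * np.exp(-(X ** 2 + Y ** 2) / (2.0 * 20.0 ** 2))

fig, ax = plt.subplots(figsize=(5, 4))
im = ax.imshow(field,
               extent=[x[0], x[-1], y[0], y[-1]],   # axes in physical units
               origin="lower",                      # row 0 at the bottom
               cmap="viridis",
               interpolation="nearest")
ax.set_xlabel("x")
ax.set_ylabel("y")
cbar = fig.colorbar(im, ax=ax)
cbar.set_label("amplitude")
```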
Other features of matplotlib
Matplotlib has native LaTeX rendering
label = r"$Math \LaTeX code$"
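A sketch of math labels using matplotlib's built-in math rendering, which does not require a LaTeX installation. The expressions, data and file name are made up; raw strings (r"...") keep the backslashes intact.

```python
import os
import tempfile
import matplotlib
matplotlib.use("Agg")   # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

t = np.linspace(0.0, 5.0, 100)

fig, ax = plt.subplots()
ax.plot(t, np.exp(-t), label=r"$e^{-t}$")
ax.set_xlabel(r"$\omega_p t$")
ax.set_ylabel(r"$E_x / E_0$")
ax.set_title(r"$\int_0^\infty e^{-t}\,dt = 1$")
ax.legend()

# Saving forces the math strings to be parsed and rendered.
out_path = os.path.join(tempfile.mkdtemp(), "latex_labels.png")
fig.savefig(out_path)
plt.close(fig)
```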
The future of visualization in Python
New modules are emerging: Chaco, MayaVi, Bokeh, stressing interactivity and dynamic data visualizations in web browsers and in 3D.
What you saw today is extremely basic and is only a tiny part of what Python is
capable of.