Examples scripts are in the examples subdirectory of the HippoDraw installation. To get an overview of what the examples do, try running the script that runs each of the examples like the following.
> cd /usr/local/share/HippoDraw/examples > python -i run_examples.py Hit return to run named script Run static_vs_dynamic:
An empty canvas window and the Inspector will be created. Recall your shell window and hit return to run each example in turn.
See the Python extension module tutorial to see how to get started and getting help.
The next section lists some typical tasks one might want to do from Python. For each task, an example is listed to illustrate how it is done.
NTuple creation by reading ASCII file:
NTuple creation by reading ROOT file: NTuple creation by reading FITS file: Finding NTuple names from a file:The Python code is shown below.
""" Demonstraton of static versus dynamic histograms. Also demonstrates taking difference between two histograms and plotting it. @author Paul F. Kunz <Paul_Kunz@slacs.stanford.edu> """ import random from time import sleep from load_hippo import app, canvas from hippo import Display, NTuple, NTupleController # Create static histogram and add it to canvas. sthist = Display ( "Static Histogram" ) sthist.setTitle ( "Gaussian Distribution (static hist)" ) sthist.setLabel ( 'x', 'X' ) sthist.setRange ( 'x', 0, 100) sthist.setBinWidth ( 'x', 1. ) canvas.addDisplay ( sthist ) # Create second static histogram and add it to canvas sthists = Display ( "Static Histogram" ) sthists.setTitle ( "Gaussian Distribution (low statistics)" ) sthists.setLabel ( 'x', 'X' ) sthists.setRange ( 'x', 0, 100) sthists.setBinWidth ( 'x', 1. ) canvas.addDisplay ( sthists ) # Create empty NTuple and set the column label. # Setting the column labels sets the number of columns ntuple = NTuple ( ) ntuple.setTitle ( 'Gaussian Distribution' ) ntuple.setLabels ( ('X value', ) ) # Register the NTuple so the Inspector can see it. NTupleController.instance().registerNTuple ( ntuple ) # Create dynamic histogram and attach it to NTuple. # It will automatically adjust its range, title, and labels dyhist = Display ( "Histogram", ntuple, ('X value', ) ) canvas.addDisplay ( dyhist ) mean = 45 sigma = 10 gauss = random.gauss # Generate some data, fill the static histograms and NTuple. for i in range(10000): x = gauss( mean, sigma ) sthist.addValues ( (x, ) ) ntuple.addRow ( (x, ) ) if i < 1000 : sthists.addValues ( (x, ) ) # only fill with first 1000 # Print some statistics from static histogram # Could do same for dynamic datarep = sthist.getDataRep() print "Histogram :" print " Title : " + sthist.getTitle() print " Entries : %i" % sthist.numberOfEntries() print " Mean = %f" % datarep.getMean ( 'x' ) print " Rms = %f" % datarep.getRMS ( 'x' ) # Print the average X value on the display canvas.selectDisplay ( sthist) canvas.addTextRep ( sthist, 'averagex' ) canvas.selectDisplay ( dyhist) canvas.addTextRep ( dyhist, 'averagex' ) # Get the contents of the bins as a DataArray high = sthist.createDataArray() low = sthists.createDataArray () # Take difference with high statistics one scaled down, and a column # to the low one. low[ 'diff' ] = high[ 'Entries / bin' ] / 10 - low[ 'Entries / bin' ] # Create an XY plot to see the difference xyplot = Display ( "XY Plot", low, ( 'X', 'diff', 'nil', 'Error' ) ) canvas.addDisplay ( xyplot ) low.register ('difference ntuple' ) print "Static histogram is on the left, dynamic one on the below it." print "You can changing the binning of the dynamic one with the Inspector" print "\nOn the right we have low statistics gaussian with difference" print "between it and high statistics plotted below it"
The script first creates static histograms, sets their attributes, and adds it to the canvas. It then creates an empty NTuple, sets its title, and labels for the columns. Since there is only one string in the tuple of labels, the ntuple will have only one column. The dynamic histogram is then created, bound to the ntuple, and adding to the canvas.
The script generates some data. In the for loop, it fills one static histogram with all the generated data and the other one with only the first 1000 "events". It also adds a rows to the ntuple. Both histograms are initially empty and you can watch them grow when you run the script.
When the data generation is done, some statistical quantities are retrieved from the static histogram and printed. One can now use the Inspector Users Guide to change the attributes of the histograms. With the dynamic histogram one can change the size of the bins, but not with the static one. They are then added to overlay the plot.
Finally the we take the difference between the high and low statistics histograms. This is done by creating a DataArray which wraps an NTuple containing the contents of the bins. DataArray objects a column of data as a numarray using syntax of a Python dictionary. Numarray objects overload the arithmetic operators, so we can take the difference with vector algebra. Adding the created column to the small DataArray also has the syntax of adding an entry to a Python dictionary. An XY plot is then created using DataArray.
""" -*- mode: python -*- Create an NTuple with power-law distribution and histogram it on log log scale @author J. Chiang <jchiang@slac.stanford.edu> """ from load_hippo import app, canvas import random, math shoot = random.uniform # Create empty NTuple of 4 columns that we can fill by row. from hippo import NTuple, NTupleController nt = NTuple () nt.setLabels ( [ 'x' ] ) xmin = 1. xmax = 100. gamma = 2.1 xpmax = math.pow ( xmax, 1. - gamma ) xpmin = math.pow ( xmin, 1. - gamma ) nsamp = 10000 for i in range(nsamp): x = shoot(0, 1) xx = math.pow ( x*( xpmax - xpmin ) + xpmin, 1./ (1. - gamma) ) nt.addRow ( (xx, ) ) from hippo import Display hist = Display ( 'Histogram', nt, ( 'x', ) ) canvas.addDisplay ( hist ) # Set the x and y axis on log scale hist.setLog ( 'x', True ) hist.setLog ( 'y', True ) # fit to power law function from hippo import Function datarep = hist.getDataRep () powerlaw = Function ( "PowerLaw", datarep ) powerlaw.addTo( hist ) powerlaw.fit()
After creating of the histogram, the display is set to be a log-log plot. When the X axis is set to log scale, the width of the bins are logarithmically increasing in size. They only appear uniform in width on the screen. The function used is a power law, which appears as straight line on this log-log plot.
""" -*- python -*- This example demonstrates applying one cut to muliple displays. Author: Paul F. Kunz <Paul_Kunz@slac.stanford.edu> $Id: examples_page.html,v 1.40 2008/03/24 19:27:26 pfkeb Exp $ """ import hippo from load_hippo import app, canvas # # Gets the FITS controller which will do the reading of the file. # ntc = hippo.FitsController.instance() # # Get full path to FITS file in case this script is not run from this # directory # import sys full_path = sys.path[0] + '/' + 'SimEvents.fits' # # Find out what the names of the HDUs in the file. # hdus = ntc.getNTupleNames ( full_path ) print "Names of the Header/Data Units are..." print hdus events = ntc.createNTuple ( full_path, hdus[1] ) print "Names of the columns" labels = events.getLabels() print labels from hippo import Display # # Create a 1D histograms # hist1 = Display ( "Histogram", events, ('GLON', ) ) canvas.addDisplay ( hist1 ) hist2 = Display ( "Histogram", events, ('GLAT', ) ) canvas.addDisplay ( hist2 ) # # Create a cut # cut = hippo.Cut ( events, ( 'zenith_angle', ) ) canvas.addDisplay ( cut ) cut.setCutRange ( 80., 90., 'x' ) # # Add this cut to multiple displays ( a Python tuple of displays ). # cut.addTargets ( ( hist1, hist2 ) ) # if only one target than # >>> cut.addTarget ( hist1 ) # print "Added one cut to mulitple displays" print "Now try manipulating the cut range," print "use yellow buttons on the tools bar to select the mode."
Applying multiple cuts to a display is shown in the example below
""" -*- python -*- This example demonstrates applying mulitiple cuts to one display Author: Paul F. Kunz <Paul_Kunz@slac.stanford.edu> $Id: examples_page.html,v 1.40 2008/03/24 19:27:26 pfkeb Exp $ """ from load_hippo import app, canvas import hippo # # Gets the FITS controller which will do the reading of the file. # ntc = hippo.FitsController.instance() # # Get full path to FITS file in case this script is not run from this # directory # import sys full_path = sys.path[0] + '/' + 'SimEvents.fits' # # Find out what the names of the HDUs in the file. # hdus = ntc.getNTupleNames ( full_path ) print "Names of the Header/Data Units are..." print hdus events = ntc.createNTuple ( full_path, hdus[1] ) print "Names of the columns" labels = events.getLabels() print labels from hippo import Display # # Create a 1D histogram # hist = Display ( "Histogram", events, ('GLON', ) ) canvas.addDisplay ( hist ) # # Create a cut and set its range # from hippo import Cut cut1 = Cut ( events, ( 'zenith_angle', ) ) canvas.addDisplay ( cut1 ) cut1.setCutRange ( 80., 100., 'x' ) # # Create a another cut and set its range # cut2 = Cut ( events, ( 'energy', ) ) canvas.addDisplay ( cut2 ) cut2.setCutRange ( 1e3, 1e4, 'x' ) cut2.setLog ( 'x', True ) cut2.setLog ( 'y', True ) cuts = ( cut1, cut2 ) # # Add the list of cuts to display # ## datarep = hist.getDataRep() ## datarep.applyCuts ( cuts ) hist.applyCuts ( cuts ) # # If applying only one cut one would do # datarep.applyCut ( cut1 ) # print "Added multiple cuts to one display"
zenith = events['zenith_angle'] z_array = zenith > 80.
In the above, the numarray is created from the DataArray with label "zenith_angle". The second expression creates a numarray containing true
and false
values. One can use this Boolean array to create a cut. Thus one can have aribitrary complex cuts by combining Boolean expressions. An example of doing two cuts explicitly and with numarry is shown below.
""" -*- python -*- This example demonstrates applying a cut using numarray Author: Paul F. Kunz <Paul_Kunz@slac.stanford.edu> $Id: examples_page.html,v 1.40 2008/03/24 19:27:26 pfkeb Exp $ """ from load_hippo import app, canvas import hippo num = None try: import numpy as num except : import numarray as num # # Gets the FITS controller which will do the reading of the file. # ntc = hippo.FitsController.instance() # # Get full path to FITS file in case this script is not run from this # directory # import sys full_path = sys.path[0] + '/' + 'SimEvents.fits' # # Find out what the names of the HDUs in the file. # hdus = ntc.getNTupleNames ( full_path ) print "Names of the Header/Data Units are..." print hdus events = ntc.createDataArray ( full_path, hdus[1] ) # Set matrix of ploats canvas.setPlotMatrix ( 3, 3 ) from hippo import Display, Cut # # Create a 1D histogram # hist = Display ( "Histogram", events, ('GLON', ) ) canvas.addDisplay ( hist ) # # First create cuts individually # from hippo import Cut cut1 = Cut ( events, ( 'zenith_angle', ) ) canvas.addDisplay ( cut1 ) cut1.setCutRange ( 80., 100., 'x' ) # # Create a another cut and set its range # cut2 = Cut ( events, ( 'energy', ) ) canvas.addDisplay ( cut2 ) cut2.setCutRange ( 1e3, 1e4, 'x' ) cut2.setLog ( 'x', True ) cut2.setLog ( 'y', True ) cuts = ( cut1, cut2 ) # # Add the list of cuts to display # datarep = hist.getDataRep() datarep.applyCuts ( cuts ) hist.applyCuts ( cuts ) # # If applying only one cut one would do # datarep.applyCut ( cut1 ) # print "Added multiple cuts to one display" # # Now we use nummary's boolean arrays to create cut # hist2 = Display ( "Histogram", events, ('GLON', ) ) canvas.addDisplay ( hist2 ) # first cut zenith = events['zenith_angle'] za = num.logical_and ( zenith > 80., zenith < 100. ) # second cut energy = events['energy'] ec = num.logical_and ( energy > 1000., energy < 10000. ) # create cut and apply it aand = num.logical_and (za, ec) cutc = Cut ( events, 'complex cut', aand , hist2 ) canvas.addDisplay ( cutc ) cutc.setLog ( 'y', True ) print "Used Boolean numarrays to create cut"
""" -*- python -*- This script demonstrates the creation of a displays with multiple data reps Author: Paul_Kunz@slac.stanford.edu $Id: examples_page.html,v 1.40 2008/03/24 19:27:26 pfkeb Exp $ """ from load_hippo import app, canvas from hippo import NTupleController # Create NTuple with its's controller. ntc = NTupleController.instance() # # create full path to the file in case this script is not run from this # directory # import sys full_path = sys.path[0] + '/' + 'aptuple.tnt' nt = ntc.createNTuple ( full_path ) from hippo import Display # Create a histogram hist = Display ("Histogram", nt, ('Age', ) ) canvas.addDisplay( hist ) # Overlay another histogram. hist.addDataRep ( 'Histogram', nt, ['Service'] ) # # Set the line style and color # reps = hist.getDataReps () from hippo import Line print "The available line styles are ..." print Line.values print "" try : reps[1].set ( Line.dash ) except RuntimeError, detail : print detail print "done" # # Set the color # from hippo import ColorValue print "The available color values are ..." print ColorValue.values print "" reps[1].set ( ColorValue.red ) reps[0].setBinWidth ( 'x', 1. ) reps[1].setBinWidth ( 'x', 2. ) full_path = sys.path[0] + '/' + 'Z0Strip.data' nt2 = ntc.createNTuple ( full_path ) # Create a strip chart display. strip = Display ( "Strip Chart", nt2, ('Energy', 'Sigma' )) canvas.addDisplay ( strip ) # Overlay strip chart with an XY plot. strip.addDataRep ( "XY Plot", nt2, ['Energy', 'Sigma', 'nil', 'error'] ) # # Get the second data representation to set its symbol and size # reps = strip.getDataReps () from hippo import Symbol print "The available symbols are ..." print Symbol.values print "" print "about to try" try : reps[1].set ( Symbol.opencircle ) print "set symbol" reps[1].setSize ( 8. ) except RuntimeError, detail : print detail # Create 2D histogram as contour plot. color = Display ( "Contour Plot", nt, ('Age', 'Cost' )) color.setBinWidth ( 'x', 1.0 ) canvas.addDisplay ( color ) print "created contour display" # Overlay it with 1D profile plot. color.addDataRep ( 'Profile', nt, [ 'Age', 'Cost' ] ) print "Overlayed various kinds of data representations"
With HippoDraw one can overlay data representations of different kinds. One can even overlay two histograms with different bin widths as is shown in this example. One can also change the line style or symbol type as well as the color. How to do these is easily seen in the Python code.
""" -*- python -*- This script adding functions and fitting. It also demonstrates retreiving an ntuple from the histogram to do something with its contents. Author: Paul_Kunz@slac.stanford.edu $Id: examples_page.html,v 1.40 2008/03/24 19:27:26 pfkeb Exp $nt """ import sys from load_hippo import app, canvas # # Create NTuple with its controller so Inspector can see it. # from hippo import NTupleController ntc = NTupleController.instance() # # Create full path to example file in case this script is not run from # this directory. # full_path = sys.path[0] + '/' + 'aptuple.tnt' nt1 = ntc.createNTuple ( full_path ) canvas.setPlotMatrix ( 2, 3 ) from hippo import Display hist = Display ( "Histogram", nt1, ("Cost", ) ) canvas.addDisplay( hist ) # Get the data representation so we can add function to it. datarep1 = hist.getDataRep() from hippo import Function gauss = Function ( "Gaussian", datarep1 ) gauss.addTo ( hist ) # Get the function parameters and display them. print "Before fitting" parmnames = gauss.parmNames ( ) print parmnames parms = gauss.parameters ( ) print parms # Now do the fitting. gauss.fit ( ) print "After fitting" parms = gauss.parameters ( ) print parms # Add another function. gauss1 = Function ( "Gaussian", datarep1 ) gauss1.addTo ( hist ) # Do another fit, should fit to linear sum gauss1.fit () # Add Chi-squared per d.f. display canvas.addTextRep ( hist, 'Chi-squared' ) # Create an NTuple from the histogram. # Calculate the residuals result = hist.createNTuple () ntc.registerNTuple ( result ) coords = result.getColumn ( 'Cost' ) values = result.getColumn ( 'Density' ) res = [] for i in range ( result.rows ) : x = coords[i] diff = values[i] - gauss1.valueAt ( x ) res.append ( diff ) # Add a column and display it. result.addColumn ( 'residuals', res ) resplot=Display ( "XY Plot", result, ( 'Cost', 'residuals', 'nil', 'Error' ) ) canvas.addDisplay ( resplot ) print "The histogram was fitted to the sum of two gaussians." print "Then histogram bins were retrieved to calculate " print "the residuals. These were then plotted as an XY Plot." print 'One could have used the "Create residuals display" button on the' print "Inspector, but that wouldn't have demonstrated anything."
After creating an NTuple from a ASCII file, a histogram is placed on the canvas. Then the data representation of it is retrieved, a function created and added to the representation. Next the function parameter names are retrieved along with the initial parameter values and printed. The parameter values are printed again after fitting. A second Gaussian function is added and the sum of the two is used for a fit. Then the chi-squared per degree of freedom is add to the plot.
The last part of the example illustrates retrieving information from the histogram. One asks the histogram to make an NTuple representation of itself. The NTuple is registered with the NTupleController so it will be visible from the Inspector. . One can thus access the histogram information by column. In the example, the residuals are calculated. The list of residuals are then added to the NTuple created by the histogram. Since this NTuple is of the same type as used to hold the initial data set, we can use it to make an XY plot as is shown in the example.
Everything done in this script could have been done with Inspector Users Guide. This includes creating of the residuals display as there is a button on the Function inspector to do this.
""" Demonstrates making simple XY plot. author Paul F. Kunz <Paul_Kunz@slac.stanford.edu> """ # # load the HippoDraw module # from load_hippo import app, canvas from hippo import Display # Create list of data energy = [90.74, 91.06, 91.43, 91.50, 92.16, 92.22, 92.96, 89.24, 89.98, 90.35] sigma = [ 29.0, 30.0, 28.40, 28.80, 21.95, 22.90, 13.50, 4.50, 10.80, 24.20] errors = [ 5.9, 3.15, 3.0, 5.8, 7.9, 3.1, 4.6, 3.5, 4.6, 3.6] # make a plot to test it. xy = Display ( "XY Plot", [ energy, sigma, errors ], ['Energy', 'Sigma', 'nil', 'error' ] ) canvas.addDisplay ( xy ) xy.setTitle ( 'Mark II Z0 scan' ) print "An XY plot is now displayed. You can use the Inspector dialog" print "to modify the appearance or fit a function to it."
After creating three Python lists of the data, one creates the display. The first argument to Display is the type of plot. The types available can be seen in the Data tabbed panel of the inspector. The second argument is a list of the data lists. This will have the effect of creating a NTuple and it will be visible from the Inspector. The third argument is also a Python list and it serves two purposes. It gives labels for the NTuple columns and it sets the binding options for the XY plot. The 3rd binding option is the X error, which is optional, so we use the "`nil'" string to indicate that binding is not to be used.
""" -*- python -*- This script demonstrates various types of Displays. Author: Paul F. Kunz <Paul_Kunz@slac.stanford.edu> $Id: examples_page.html,v 1.40 2008/03/24 19:27:26 pfkeb Exp $ """ from load_hippo import app, canvas from hippo import NTupleController # # Get full path to examplefile in case this script is not run from this # directory. # import sys full_path = sys.path[0] + '/' + 'aptuple.tnt' # Read an ASCII NTuple file to create an NTuple. ntc = NTupleController.instance() nt1 = ntc.createNTuple ( full_path ) from hippo import Display # Change the matrix from the default canvas.setPlotMatrix ( 3, 4 ) # Create the displays. hist = Display ("Histogram", nt1, ('Cost', ) ) canvas.addDisplay( hist ) profile = Display ( "Profile", nt1, ('Age', 'Cost' )) canvas.addDisplay ( profile ) color = Display ( "Color Plot", nt1, ('Age', 'Cost' )) color.setBinWidth ( 'x', 1.0 ) canvas.addDisplay ( color ) contour = Display ( "Contour Plot", nt1, ('Age', 'Cost' )) contour.setBinWidth ( 'x', 1.0 ) canvas.addDisplay ( contour ) profile2D = Display ( "Profile 2D", nt1, ('Age', 'Cost', 'Service' )) canvas.addDisplay ( profile2D ) profileC = Display ( "Profile Contour", nt1, ('Age', 'Cost', 'Service' )) canvas.addDisplay ( profileC ) scatter = Display ( "Scatter Plot", nt1, ('Age', 'Cost' )) canvas.addDisplay ( scatter ) # # Get full path to examplefile in case this script is not run from this # directory # full_path = sys.path[0] + '/' + 'Z0Strip.data' nt2 = ntc.createNTuple ( full_path ) strip = Display ( "Strip Chart", nt2, ('Energy', 'Sigma' )) canvas.addDisplay ( strip ) xy = Display ( "XY Plot", nt2, ('Energy', 'Sigma', 'nil', 'error' )) canvas.addDisplay ( xy ) y = Display ( "Y Plot", nt2, ('Energy', ) ) canvas.addDisplay ( y ) xyz = Display ( "XYZ Plot", nt1, ('Age', 'Service', 'Cost' )) canvas.addDisplay ( xyz ) vm = Display ( "Variable Mesh", nt2, ( 'Energy', 'Sigma', 'Sigma', 'binsize', 'error' ) ) canvas.addDisplay ( vm ) print "Various type of displays were created"
The beginning of the script sets the form factor of how the plots will be displayed, name 3 across and 4 down. This was done so they would all fit on one page. Nothing else is new compared to the previous examples.
""" -*- mode:python -*- Demo of reading ROOT file with function, cuts, and calculation This examples depends on a ROOT file not included in the distribution because of its size. It can be downloaded from here... ftp://ftp-glast.slac.stanford.edu/glast.u11/InstrumentAnalysis/MC/EngineeringModel-v5r0608p7/2Tower_surface_muons/surface_muons_4M_merit.root author: Paul F. Kunz <Paul_Kunz@slac.stanford.edu> $Id: examples_page.html,v 1.40 2008/03/24 19:27:26 pfkeb Exp $ """ from hippo import RootController, HDApp, Display rc = RootController.instance() filename = "/nfs/farm/g/glast/u33/InstrumentAnalysis/MC/EngineeringModel-v6r070329p28/Surface_muons/surface_muons_4M_merit.root" # Get the top level tree names from the file ntuple_names = rc.getNTupleNames ( filename ) print "In this file, tree names are ", ntuple_names # Create a data array from the first tree ntuple = rc.createDataArray ( filename, 'MeritTuple' ) # Pprint some information about this tree print "Number of columns = ", ntuple.columns labels = ntuple.getLabels() print "First ten column labels are ... ", labels[:10] print "Number of rows = ", ntuple.rows # Create a histogram for one of the columns and add it to the canvas hist = Display ( "Histogram", ntuple, ('TkrEnergy', ) ) # Up to now, we didn't need the HippoDraw application running. # Now we do in order to view the data. app = HDApp() canvas = app.canvas() canvas.addDisplay ( hist ) # Set the Y axis on log scale of better viewing hist.setLog ( 'y', True ) # Add a cut from data in another column from hippo import Cut hits_cut = Cut ( ntuple, ('TkrTotalHits',) ) canvas.addDisplay ( hits_cut ) hits_cut.setLog ( 'y', True ) hits_cut.addTarget ( hist ) hits_cut.setCutRange ( 4, 110, 'x' ) # Change the range of the displayed data hist.setRange ( 'x', 40, 700 ) # fit a function to the histogram from hippo import Function datarep = hist.getDataRep () exp1 = Function ( "Exponential", datarep ) exp1.addTo ( hist ) exp1.fit () # Print the results of the fit pnames = exp1.parmNames () print pnames parms = exp1.parameters () print parms # add another function to the histogram and fit the linear sum exp2 = Function ( "Exponential", datarep ) exp2.addTo ( hist ) exp1.fit() # always fit to linear sum # Build an array which is the sum of two columns and add it to the ntuple label = "Raw sum" ntuple [ label ] = ntuple [ 'TkrEnergy' ] + ntuple [ 'CalEnergyCorr' ] # Histogram the new column sum_hist = Display ( 'Histogram', ntuple, (label, ) ) canvas.addDisplay ( sum_hist ) sum_hist.setLog ( 'x', True ) sum_hist.setLog ( 'y', True ) # Compare it with the official sum of energies merit_sum = Display ( 'Histogram', ntuple, ( 'EvtEnergyCorr', ) ) canvas.addDisplay ( merit_sum ) merit_sum.setLog ( 'x', True ) merit_sum.setLog ( 'y', True ) sum_hist.setRange ( 'x', 1.0, 1e+06 )
One uses the single instance of the RootController object to read the file. First one gets the names of the top level TTree which HippoDraw interprets as NTuple names.
In the next step, one could have created an NTuple with the method createNTuple and the same arguments. It would be implemented by the RootNTuple class. This class only reads the data into memory as it is needed, so it is quite efficient even for files that have a large number of columns.
Instead of a NTuple, one creates a DataArray. It contains a NTuple and behaves like one for most purposes but gives us additional flexibility as will be seen near the end of the example.
The next few lines shows how to get the number of rows and columns as well as the labels for the columns. The script then creates a histogram and adds it to the canvas. Since the distribution falls off rather quickly, the Y axis is set to a log scale.
The script then applies a cut and adds the display of the cut variable to the canvas. The cut distribution also falls rather quickly to its Y axis is also put on a log scale.
The next step gets a handle on the data representation, i.e. the actual histogram, in order to fit a function to it. Not satisfied with the fit, a second function is applied. Note that when two functions are applied, a fit is done to the linear sum of the functions.
The last part of the script shows the power of using a DataArray. In one line the script adds two columns together, row by row, and adds the result to the DataArray. Let's take it step by step to understand it. The expression
ntuple [ 'TkrEnergy' ]
shows that the DataArray behaves like a Python dictionary in that we can access a column by a key. The key is the label of the column. It returns a numarray. The expression
ntuple [ 'TkrEnergy' ] + ntuple [ 'CalEnergySum' ]
thus accesses two columns and adds them together because the numarray class overload the operator + to create a new numarray whose values are the sum of the others, element by element.
Finally, a column is added with the expression
ntuple [ label ] =
where again the DataArray behaves like a Python dictionary. The right hand side of the assignment must be a numarray, which it is in this case. The column is not added to the ROOT file, it is only logically added in memory.
Treating columns of the data source as vectors and doing vector algebra on them can be considerably faster than writing for loops. The code is much easier to write and understand as well.
The example code is shown below. The explanation of the code is in the comments.
""" -*- mode:python -*- Demo of reading ROOT SVAC data set a with numarrray and adding variables to your own ntuple. Numarray functions used in this script are described here... <http://stsdas.stsci.edu/numarray/numarray-1.5.html/node21.html> and node22.html and node38.html author: Paul F. Kunz <Paul_Kunz@slac.stanford.edu """ # # Get numarrays: # import numarray # # Get canvas: # import hippo app = hippo.HDApp() canvas = app.canvas () # # HippoDraw Root interface: # from hippo import RootController, Display rc = RootController.instance() # # SVAC ntuple we want to look at: # filenameSvac = "/nfs/farm/g/glast/u27/Integration/rootData/135005389/v5r0703p6/calib-v1r0/svacTuple/emRootv0r0/svacTuple-v3r4p4_135005389_svac_svac.root" print "Opening SVAC file ", filenameSvac print "Ignore the warning and error messages which come from ROOT" # # SVAC Ntuple tree. The ROOT file could have multiple trees, we # probably want the first one. # tree_names = rc.getNTupleNames ( filenameSvac ) print "In the SVAC file the tree names are: ", tree_names nameSvac = tree_names[0] # # Get the SVAC DataArray: It is called 'daSvac' in this script # The dimensinality will be # number of columns = number of TBranch's # number of rows = number of events # daSvac = rc.createDataArray ( filenameSvac, nameSvac ) # # Read in the trigger vectors. The `GemTkrVector' variable in this column # is a vector of size 16 # # The first method we use a boolean expression, and then convert it to # integer array. The second method we use the where function # # The dimensions of the resulting numarrays are # number of rows = number of events # number of columns = size of the vector (16) # print "Verify shape of ROOT array variable" na = daSvac['GemTkrVector'] print na.shape import time tstart = time.time() print "start" TkrTriggered = ( daSvac [ 'GemTkrVector'] == 1).astype(numarray.Int32) print "Verify shape of result of vector expression" print TkrTriggered.shape CalLeTriggered = numarray.where ( daSvac ['GemCalLeVector'] == 1, 1, 0 ) # # The labels of the columns we will add to the data set # label_TkrTriggered = 'Number of Towers with Tkr Triggers' label_CalLeTriggered = 'Number of Towers with CalLe Triggers' label_TowerTkrTrigGemCond = 'Number of Towers with Tkr Triggers after Cut' label_TowerCalLeTrigGemCond = 'Number of Towers with CalLe Triggers after Cut' # # For each event, take the sum of the columns. # Second argument in sum() is 0 for sum of row and 1 for column # Then add resulting vector as a new column to the data set # nbrTkrTriggered = numarray.sum ( TkrTriggered, 1 ) print "Verify shape of result of sum" print nbrTkrTriggered.shape daSvac [ label_TkrTriggered ] = nbrTkrTriggered nbrCalLeTriggered = numarray.sum ( CalLeTriggered, 1 ) daSvac [ label_CalLeTriggered ] = nbrCalLeTriggered # # Get the branch as an array and create a numarray withere the value # is equal to 7 # t = daSvac [ 'GemConditionsWord' ] == 7.0 # # If GemConditionsWord == 7., then fill element with `nbrTkrTriggered', # otherwise with -1 # daSvac [ label_TowerTkrTrigGemCond ] = \ numarray.choose ( t, ( nbrTkrTriggered, -1 ) ) daSvac [ label_TowerCalLeTrigGemCond ] = \ numarray.choose ( t, ( nbrCalLeTriggered, -1 ) ) tend = time.time() print "Took %f seconds create the 4 new columns with 500,000 rows each" % \ (tend -tstart) # # The rest is standard procedure # tkrtrighist = Display ("Histogram", daSvac, (label_TkrTriggered,) ) canvas.addDisplay ( tkrtrighist ) tkrtrighist.setLog ( 'y', True) calletrighist = Display ("Histogram", daSvac, (label_CalLeTriggered, ) ) canvas.addDisplay ( calletrighist ) calletrighist.setLog ( 'y', True) tkrtrighist_gemcond = Display ( "Histogram", daSvac, (label_TowerTkrTrigGemCond, ) ) canvas.addDisplay ( tkrtrighist_gemcond ) tkrtrighist_gemcond.setRange ( 'x', 0, 16) tkrtrighist_gemcond.setLog ( 'y', True) print tkrtrighist_gemcond.numberOfEntries()
The instance of the FitsController is used to read the FITS file. FITS files can contain multiple images and tables. You can use the FitsController to find the names of the Header/Data Units (HDU) as shown. The first HDU is always an image, so the tabular data is probably the second HDU. The FitsController creates DataArray object (see The DataArray Python class) from given the filename and the name of the HDU. The DataArray holds a FitsNTuple object that knows how to read columns from a ASCII or binary table.
The FitsController and FitsNTuple classes are written in C++ and use the standard CFITSIO package for reading the FITS file.
""" -*- mode:python -*- Demonstrate using FitsController to create a new FITS file author: Paul F. Kunz <Paul_Kunz@slac.stanford.edu> """ from hippo import FitsController, Display fc = FitsController.instance() import sys fullname = sys.path[0] + '/' + "SimEvents.fits" print "Opening file ", fullname hdu_names = fc.getNTupleNames ( fullname ) print "Names of the HDUs are ", hdu_names darray = fc.createDataArray ( fullname, hdu_names[1] ) # pass the empty image print "Number of columns = ", darray.columns print "Column labels are ..." print darray.getLabels() print "Number of rows = ", darray.rows import hippo app = hippo.HDApp() canvas = app.canvas() plot = Display ( "Color Plot", darray, ('GLON', 'GLAT', 'nil', 'nil' ) ) canvas.addDisplay ( plot ) # # Calculate Log(energy) and add it as new column # import numarray darray[ 'LogE' ] = numarray.log ( darray [ 'energy' ] ) lplot = Display ( 'Histogram', darray, ( 'LogE', ) ) lplot.setLog ( 'y', True ) canvas.addDisplay ( lplot ) # # Compare it with logrithmic binning # clplot = Display ( 'Histogram', darray, ( 'energy',) ) canvas.addDisplay ( clplot ) clplot.setLog ( 'x', True ) clplot.setLog ( 'y', True ) # # Apply a cut to the displays # cut = hippo.Cut ( darray, ( 'time', ) ) canvas.addDisplay ( cut ) cut.setCutRange ( 5600., 14000., 'x' ) cut.addTargets ( (plot, lplot, clplot) ) # # Add array object to column # glon = darray['GLON'] glat = darray['GLAT'] a2d = numarray.array ( [ glon, glat ] ) a2d.transpose() print "Now have" print "rank = ", a2d.rank print "shape = ", a2d.shape print a2d darray [ 'glon-glat' ] = a2d darray.setTitle ( 'Lat events modified' ) print "\nCurrent columns are" print darray.getLabels() # # Write the modified FITS binary table to a file # fc.writeToFile ( darray, 'sim2.fits' ) # # Write to FITS file only rows passing cuts and only certain columns # columns = [ 'GLON', 'GLAT', 'energy', 'time', 'LogE', 'glon-glat' ] fc.writeToFile ( darray, 'sim3.fits', (cut,), columns ) # # Verify the written file # names2 = fc.getNTupleNames ( 'sim3.fits' ) print "\nHDU names are ", names2 darray2 = fc.createDataArray ( 'sim3.fits', names2[1] ) print "\nColumn labels are", darray2.getLabels() # # Display data, should appear same as original under cut # plot2 = Display ( "Color Plot", darray2, ('GLON', 'GLAT', 'nil', 'nil' ) ) canvas.addDisplay ( plot2 ) lplot2 = Display ( 'Histogram', darray2, ( 'LogE', ) ) lplot2.setLog ( 'y', True ) canvas.addDisplay ( lplot2 ) # # Get the array column # glonglat = darray2 [ 'glon-glat' ] print "\ncheck the array variable" print "\Rank = ", glonglat.rank print "shape = ", glonglat.shape print "And the data is under the cut" print glonglat
The first part of the script shows how to get information about the file and its tables. it than creates a two dimensional histogram and adds it to the canvas. Next a column is added to the DataArray which is the log10 of the column containing `energy' and a plot of it is added to the canvas, with the Y axis on a log scale. For comparison, another plot of energy is added to the canvas with both the X and Y axes on a log scale. For histograms, this also sets the bin widths to a logrithmically increasing size. Thus, these last two plots should look the same.
The script next creates a cut on time and sets it to an arbitrary range. The cut itself is placed on the canvas. The cut is then applied to all three plots.
Next, a simple vector is added as a column. using numarray. It is added to DataArray in the same manner as any other array.
The script then writes the DataArray, including the added columns, to the FITS file sim2.fits. This has the same effect as Export text ntuple menu item of the CanvasWindow. Although this DataArray contains a FitsNTuple, and type of data source could have been used.
The script then creates another FITS file from the DataArray, but only with the rows that pass the cut and the selected columns. This has the same effect as the Create NTuple... menu item of the CanvasWindow.
The last part of the script reads the newly created FITS file and checks if it contains what is expected.
The example is shown below. It also illustrates use of numarray to do vector arithmatic.
""" -*- python -*- This script tests and demonstrates appending to NTuple Copyright (C) 2006 The Board of Trustees of The Leland Stanford Junior University. All Rights Reserved. Author: Paul F. Kunz <Paul_Kunz@slac.stanford.edu> $Id: examples_page.html,v 1.40 2008/03/24 19:27:26 pfkeb Exp $ """ # # Import hippo and create application and canvas # from load_hippo import app, canvas from hippo import * num = None d_type = None # # Set array module and double type for it # try: import numarray as num d_type = num.double except: import numpy as num d_type = 'd' # # Create empty arrays # da1 = DataArray () da2 = DataArray () # # labels of columns that will be used often # data = 'data' sqdata = 'sqrt(data)' # # Create and fill two columns # da1 [ data ] = num.array ( range ( 10), d_type ) da1 [ sqdata ] = num.sqrt ( da1 [ data ] ) # # Plot them # xyplot = Display ( 'XY Plot', da1, ( data, sqdata, 'nil', 'nil' ) ) canvas.addDisplay ( xyplot ) # # Create a similar array # da2 [ data ] = num.array ( range ( 10, 20, 1 ), d_type) da2 [ sqdata ] = num.sqrt ( da2 [ data ] ) # # Append second to first # da1.append ( da2 ) print "Should see 20 data points, the result of appending two DataArrays"
If you are new to Python and/or numarray, this line in the example you may find confusing
da1 [ 'data' ] = numarray.array ( range ( 10 ) )
What is happening here is that the range function returns a Python list object. It is then used to construct a numarray object. The resulting numarray is added to the data array as a new column with the label data.
This example also illustrates how to directly manipulate the fitting process, using various components. In future releases, some of these components could also be written in Python.
Finally, this example uses a DataArray to hold the data set.
""" -*- mode:python -*- Demonstrates implementing a function in Python derived from the C++ abstract base class. Also demonstrates use of the fitter from Python, without using the GUI @author Paul F. Kunz <paul_kunz@slac.stanford.edu> """ # #$Id: examples_page.html,v 1.40 2008/03/24 19:27:26 pfkeb Exp $ # from load_hippo import app, canvas import random num = None try: import numpy as num d_type = 'd' except: import numarray as num d_type = num.double slope = 2. intercept = 1. sigma = 2. from hippo import FunctionBase # # Define a class in Python that is derived from C++ base class # class Linear ( FunctionBase ) : def __init__ ( self ) : FunctionBase.__init__( self ) self.initialize () def initialize ( self ) : self.setName ( 'linear' ) self.setParmNames ( [ 'Intercept', 'Slope' ] ) self.setParameters ( [ intercept, slope ] ) def dimensions ( self ) : return 1 def valueAt ( self, x ) : parms = self.getParameters () return x * parms[1] + parms[0] def derivByParm ( self, i, x ) : if i == 0 : return 1.0 if i == 1 : return x gauss = random.gauss # Generate some data using numarray npts = 20 x = num.array(num.arange(npts), d_type) y = num.array( [gauss(slope*xx + intercept, sigma) for xx in x] ) xerr = num.ones ( npts ) yerr = sigma*num.ones(npts) # Create a DataArray and register it so that the Inspector can display it. # The DataArray is will wrap a DataSource of type NTuple. from hippo import DataArray da = DataArray ( 'NTuple' ) # use NTuple to store data da.register ( 'randomized line' ) # name the data source # Fill the contents by adding named columns in the order expected by # the fitter (more later) da['x'] = x da['y'] = y da['xerr'] = xerr da['yerr'] = yerr # Create an XY plot, ignoring xerr, and add it to the canvas. from hippo import Display disp = Display ('XY Plot', da, ('x', 'y', 'nil', 'yerr' ) ) canvas.addDisplay ( disp ) # Get the name of the available fitters from the factory from hippo import FitterFactory factory = FitterFactory.instance() fitters = factory.names() print fitters # Create a fitter, in this case Minuit Migrad algorithm with Chi Sq migrad = factory.create ( fitters[1] ) print migrad.name() # create a function f = Linear () print f.name() print f.parmNames() # Get the FCN so we can set it up fcn = migrad.getFCN () fcn.setFunction ( f ) fcn.setDataSource ( da.dataSource() ) fcn.setUseErrors ( True ) # Since we are using Minuit, we can set limits on the parameters. If # we wanted to do that we would uncomment the following. #migrad.setLimits ( 0, -2., -1. ) # Minimize the Chi Sq migrad.minimize() # Print the resulting parameters print f.getParameters() # Add the function to the display of the data fl = Linear () disp.addFunction ( f ) # Add the function to the factory so it can be used from the GUI # One might just do this instead of using the minimizers directly from hippo import FunctionFactory ff = FunctionFactory.instance() ff.add ( f ) # Create an DataArray with columns not in order expected by the fitter da2 = DataArray ( 'NTuple' ) # use NTuple to store data da2.register ( 'randomized line 2' ) # name the data source # Fill the contents by adding named columns in the order expected by # the fitter (more later) da2['y'] = y da2['x'] = x da2['yerr'] = yerr da2['xerr'] = xerr # Teach the FCN the columns to use, ignoring xerr. The `1' indicates # the dimension of the coordinate space fcn.setDataSource ( da2.dataSource(), 1, [ 1, 0, -1, 2 ] ) f.setParameters ( [ intercept, slope ] ) migrad.minimize () print f.getParameters ()
This example also illustrates how to directly manipulate the fitting process, using features, some of which are only available with the Minuit fitter.
""" -*- mode:python -*- This example shows how to do 2 dimension fitting by talking to the fitter and its FCN directly. It also shows the optional setting for the fitting process that are only available with a Minuit based fitter. author Paul F. Kunz <Paul_Kunz@slac.stanford.edu> $Id: examples_page.html,v 1.40 2008/03/24 19:27:26 pfkeb Exp $ """ import random num = None d_type = None try: import numpy as num d_type = 'd' except: import numarray as num d_type = num.double import hippo from hippo import FunctionBase # # Declare a function that take is 2 dimensional. # class Planar ( FunctionBase ) : def __init__ ( self, other = None ) : if other : FunctionBase.__init__( self, other ) else : FunctionBase.__init__( self ) self.initialize () def initialize ( self ) : self.setName ( 'Planar' ) self.setParmNames ( [ 'Interscept', 'SlopeX', 'SlopeY' ] ) self.setParameters ( [ 0., 1., 1. ] ) # initial parameters def valueAt ( self, x, y ) : parms = self.getParameters () return parms[0] + parms[1] * x + parms[2] * y # # Generate some data. # slope = 2. intercept = 1. mean = 10. sigma = 4. gauss = random.gauss def plane ( x, y ) : return x + y npts = 20 x = num.arange(npts*npts) y = num.arange(npts*npts) z = num.arange(npts*npts) zerr = num.arange(npts*npts) k = 0 for i in range(npts) : for j in range(npts) : x[k] = xx = i y[k] = yy = j z[k] = plane ( xx, yy ) + gauss ( 0., 2. ) zerr[k] = sigma k += 1 # # Store the data in NTuple # nt = hippo.NTuple () # empty ntuple nt.addColumn ( 'x', x ) # add column nt.addColumn ( 'y', y ) nt.addColumn ( 'z', z ) nt.addColumn ( 'zerr', zerr ) # # Get the list of available fitters so one can verify that Minuit is # available. # from hippo import FitterFactory factory = FitterFactory.instance() fitters = factory.names() print "Show the fitters available" print fitters # # Create a Minuit Migrad fitter # migrad = factory.create ( fitters[1] ) # # Create an instance of our custom function and verify its names # f = Planar () print "Show the functin name and its parameter names" print f.name() print f.parmNames() # # Get the objective function so we can set the function and data to be used # fcn = migrad.getFCN () fcn.setFunction ( f ) # # Tell the FCN how to bind to the NTuple # columns are x y z zerr fcn.setDataSource ( nt, 2, [ 0, 1, 2, -1, -1, 3 ] ) # no xerr or yerr fcn.setUseErrors ( False ) # do not use errors on z value migrad.minimize() chisq = fcn.objectiveValue () print "Show ChiSq after minimization with no Z error" print "ChiSq =", chisq print "Degrees of Freedom = ", fcn.degreesOfFreedom() fcn.setUseErrors ( True ) # use error on z value. chisq = fcn.objectiveValue () print "Show ChiSq after minimization using Z error" print "ChiSq =", chisq print "Degrees of Freedom = ", fcn.degreesOfFreedom() # # Parameters are returned as a tuple, so # get a copy of thie parameter values by creating a list. # parms = list(f.getParameters()) print "Show the parameter values" print parms # # Set a parameter value and have it fixed in the fit # parms[1] = 1.0 f.setParameters ( parms ) migrad.setFixedFlags ( ( 0, 1, 0 ) ) migrad.minimize () print "Show parameter values with 2nd one fixed" print f.getParameters() # # Try setting limits, only available with Minuit fitter # migrad.setLimits ( 2, 1.1, 1.2 ) migrad.minimize() print "Show parameter values after setting limits on 3rd one." print f.getParameters () print "There's no need to display the data"
The Python code is shown below.
""" -*- mode:python -*- Demo of doing a Fast Fourier Transform on simulated white noise. author: Paul F. Kunz <Paul_Kunz@slac.stanford.edu> """ from load_hippo import app, canvas import hippo, hippofft num = None try: import numpy as num import numpy.random as ra except : import numarray as num import num.random_array as ra canvas.setPlotMatrix ( 1, 3 ) # # generate a random ordered sequence # npts = 400000 times = num.sort ( ra.uniform ( 1e9, 1e10, (npts, ) ) ) ds = hippo.NumArrayTuple () ds.addColumn ( 'times', times ) # # Use the hippo FFT utility to get binned light curve display the # normalized contents of the bins # lightcurve, bintimes = hippofft.simpleLightCurve( ds,'times', numberbins=4096 ) canvas.addDisplay ( lightcurve ) # # Calculate the power spectrum by giving hippo the range and unit of # time # range = lightcurve.getRange ( 'x' ) onetick = 1. / 20.0e6 spectrum = hippofft.powerSpectrum ( bintimes, range, onetick ) # # The spectrum return is ntuple of two columns which can be used # directly to plot it as an XY plot # labels = spectrum.getLabels() xyplot = hippo.Display ( 'XY Plot', spectrum, labels ) canvas.addDisplay ( xyplot ) # # Check that the power distribution is correct by first histogramming # it # hist = hippo.Display ( 'Histogram', spectrum, (labels[1], ) ) hist.setLog ( 'y', True ) canvas.addDisplay ( hist ) # # and then fitting it to an exponential # datarep = hist.getDataRep() exp = hippo.Function ( 'Exponential', datarep ) exp.addTo ( hist )
The first part generates white noise time series in atribitary units. We create a NumArrayTuple to holds the data as a one column DataSource. This type of DataSource holds a reference to the data instead of making a copy of it.
Then the hippofft.simpleLightCur function is calledwith the data source and the label of the (only) column. The last srgument is optionsal and is the number of bins to be used to generate the light curve. The default is 1024. The function returns two objects. The first is the histogram of the data. It can be added to the canvas as shown. The second is the contents of the bins appropriately normalized.
The power spectrum is then calculated given the bins, the range of the data, and the units of time in seconds. The function returns a DataArray containing an NTuple object with two columns. The first column is the frequency and the second is the power. One can create a XY plot of the spectrum as shown.
To check that hte transform is correct, the script creates a histogram of the power and plots it on a log scale. It then fits it to an exponential. The resulting fit should havethe scale factor around 2.0.