Python plots with no display

I have been doing some offline data processing and creating graphs on the fly that automatically get updated on a website. The problematic part has been doing this without a display (for example, when run from a cron job). I found a solution that seems to work with the EPD package I am using on a Linux box.

[cc lang="python"]
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg

fig = Figure(figsize=(4,4))
fig.gca().plot(range(1,10))
canvas=FigureCanvasAgg(fig)
canvas.print_figure('bob.png', dpi=150)
[/cc]

There are likely some other ways to do it, but this works for me.
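One of those other ways, sketched here rather than tested on EPD, is to select the non-interactive Agg backend with `matplotlib.use` before importing `pyplot`. The in-memory buffer is only an assumption to keep the example self-contained; a filename works the same way.

```python
import io

import matplotlib
matplotlib.use('Agg')  # pick the non-interactive backend before importing pyplot
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 4))
ax.plot(range(1, 10))

# Render to an in-memory PNG; pass a filename instead to write to disk
buf = io.BytesIO()
fig.savefig(buf, format='png', dpi=150)
png_bytes = buf.getvalue()
plt.close(fig)
```

This keeps the familiar `pyplot` interface, at the cost of having to set the backend first.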

Reading Raw Data in Python

In the same vein as reading raw data into Matlab, I created an equivalent function in Python:

[cc lang="python"]
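import gzip
import numpy as np

def readraw(filename, shape, intype='int16', byteSwap=False):
    """ readraw - To read in a raw file and reformat it to the right shape """

    # Read in the file, handling gzip-compressed files transparently
    opener = gzip.open if filename.endswith('gz') else open
    with opener(filename, 'rb') as fp:
        # numpy.fromfile cannot read from a gzip file object, so read the bytes first
        d = np.frombuffer(fp.read(), dtype=intype).reshape(shape).copy()

    # Swap the byte order if requested
    if byteSwap:
        d = d.byteswap()

    return d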
[/cc]
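An alternative to swapping bytes after reading is to state the byte order in the dtype string. The sketch below writes a small big-endian volume to a temporary file and reads it back. The file name and array shape are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Write a small big-endian int16 volume to a temporary raw file
original = np.arange(24, dtype='>i2').reshape(2, 3, 4)
path = os.path.join(tempfile.mkdtemp(), 'demo.raw')
original.tofile(path)

# Read it back, stating the byte order ('>' = big-endian) in the dtype
volume = np.fromfile(path, dtype='>i2').reshape(2, 3, 4)

# Convert to the machine's native byte order for later processing
native = volume.astype(np.int16)
```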

Anisotropic Diffusion Image Filtering in MRI

Background

Magnetic resonance imaging has the tradeoff of signal-to-noise vs time vs resolution.  You can only choose two. For some applications it may be better to get higher temporal and spatial resolution than signal-to-noise and then one may do some spatial filtering.  Simple filtering would be applying a median filter or Gaussian smoothing over the image (or volume).  But there are better techniques.

Smarter Filtering

One option for a smarter filter is the anisotropic diffusion filter, which was first introduced to MRI in 1992 ((G. Gerig et al., “Nonlinear anisotropic filtering of MRI data,” Medical Imaging, IEEE Transactions on 11, no. 2 (1992): 221-232. )).  The basic idea is that, given a central voxel in a kernel and an estimate of the noise, each surrounding voxel contributes to the smoothing according to how much its signal differs from the central voxel relative to the noise estimate. Neighbours with a similar signal are averaged in, while neighbours across an edge are largely left out.
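To make the weighting concrete, here is a minimal sketch of the weight exp(-(|d|/kappa)^2) used in the code further down. The function name `conduction` and the numbers are assumptions chosen for illustration.

```python
import numpy as np

def conduction(diff, kappa):
    """Weight given to a neighbour whose signal differs from the centre by diff."""
    return np.exp(-(np.abs(diff) / kappa) ** 2)

kappa = 10.0                               # assumed noise estimate
diffs = np.array([0.0, 5.0, 10.0, 30.0])   # signal differences to the central voxel
weights = conduction(diffs, kappa)

for d, w in zip(diffs, weights):
    print("difference %5.1f -> weight %.4f" % (d, w))
```

Differences well below kappa get a weight near 1 and are smoothed. Differences several times kappa get a weight near 0, which is what preserves edges.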

I wrote a paper on this technique applied to multi-echo data ((Craig K Jones, Kenneth P Whittall, and Alex L MacKay, “Robust myelin water quantification: averaging vs. spatial filtering,” Magnetic Resonance in Medicine: Official Journal of the Society of Magnetic Resonance in Medicine / Society of Magnetic Resonance in Medicine 50, no. 1 (July 2003): 206-209)).

There is a fine line between filtering and over-filtering. That is a whole separate discussion.

The images below are a single slice of an MPRAGE image without filtering (left) and with anisotropic diffusion filtering (right). The bottom set are just zoomed-in versions of the top. The filtered data might be slightly over-filtered, but this was done to show the effect of the filter.

Code

Matlab

The version below is for a 3D dataset:
[cc lang="matlab"]

function [filt_vol] = aniso3d(orig_vol, kappa, niters)

if( nargin < 3 )
    error('aniso3d: Need more parameters');
end

filt_vol = orig_vol;

for iters = 1:niters

    % Differences to the neighbours along columns (E/W), rows (N/S) and slices (U/D)
    dE = convn(filt_vol, [0 -1 1], 'full'); dE=dE(:,2:size(dE,2)-1,:);
    dW = convn(filt_vol, [-1 1 0], 'full'); dW=dW(:,2:size(dW,2)-1,:);
    dN = convn(filt_vol, [0; -1; 1], 'full'); dN=dN(2:size(dN,1)-1,:,:);
    dS = convn(filt_vol, [-1; 1; 0], 'full'); dS=dS(2:size(dS,1)-1,:,:);
    kernel = zeros(1,1,3); kernel(2) = -1; kernel(3) = 1;
    dU = convn(filt_vol, kernel, 'full'); dU=dU(:,:,2:size(dU,3)-1);
    kernel = zeros(1,1,3); kernel(1) = -1; kernel(2) = 1;
    dD = convn(filt_vol, kernel, 'full'); dD=dD(:,:,2:size(dD,3)-1);

    % Weight each difference by exp(-(|d|/kappa)^2) so that edges diffuse little
    filt_vol = filt_vol + ...
        3/28 * ((double(exp(- (abs(dE) / kappa).^2 )) .* double(dE)) - (double(exp(- (abs(dW) / kappa).^2 )) .* double(dW))) + ...
        3/28 * ((double(exp(- (abs(dN) / kappa).^2 )) .* double(dN)) - (double(exp(- (abs(dS) / kappa).^2 )) .* double(dS))) + ...
        1/28 * ((double(exp(- (abs(dU) / kappa).^2 )) .* double(dU)) - (double(exp(- (abs(dD) / kappa).^2 )) .* double(dD)));
end
[/cc]

For 4D data one can also smooth across the 4th dimension (whether it is time, diffusion etc).
[cc lang="matlab"]
function [filt_vol] = aniso3d_chan(orig_vol, kappa, niters)
%
% aniso3d_chan - Run the anisotropic diffusion filter in 3D
% and over the multiple channels.
%

if( nargin < 3 )
    error('aniso3d_chan: Need more parameters');
end

% Work in floating point
filt_vol = double(squeeze(orig_vol));

for iters = 1:niters
    % For each direction the weight is computed from the norm of the
    % difference across the 4th dimension, then applied to every channel.
    dE = convn(filt_vol, [0 -1 1], 'full'); dE=dE(:,2:size(dE,2)-1,:,:);
    cE = repmat(sqrt(sum(dE.^2, 4)), [1 1 1 size(dE,4)]);
    filt_vol = filt_vol + 3/28 * ((exp(- (cE / kappa).^2 )) .* (dE));
    clear cE;
    clear dE;

    dW = convn(filt_vol, [-1 1 0], 'full'); dW=dW(:,2:size(dW,2)-1,:,:);
    cW = repmat(sqrt(sum(dW.^2, 4)), [1 1 1 size(dW,4)]);
    filt_vol = filt_vol - 3/28 * ((exp(- (cW / kappa).^2 )) .* (dW));
    clear dW;
    clear cW;

    dN = convn(filt_vol, [0; -1; 1], 'full'); dN=dN(2:size(dN,1)-1,:,:,:);
    cN = repmat(sqrt(sum(dN.^2, 4)), [1 1 1 size(dN,4)]);
    filt_vol = filt_vol + 3/28 * ((exp(- (cN / kappa).^2 )) .* (dN));
    clear dN;
    clear cN;

    dS = convn(filt_vol, [-1; 1; 0], 'full'); dS=dS(2:size(dS,1)-1,:,:,:);
    cS = repmat(sqrt(sum(dS.^2, 4)), [1 1 1 size(dS,4)]);
    filt_vol = filt_vol - 3/28 * ((exp(- (cS / kappa).^2 )) .* (dS));
    clear cS;
    clear dS;

    kernel = zeros(1,1,3); kernel(2) = -1; kernel(3) = 1;
    dU = convn(filt_vol, kernel, 'full'); dU=dU(:,:,2:size(dU,3)-1,:);
    % dS has already been cleared, so take the channel count from dU
    cU = repmat(sqrt(sum(dU.^2, 4)), [1 1 1 size(dU,4)]);
    filt_vol = filt_vol + 1/28 * ((exp(- (cU / kappa).^2 )) .* (dU));
    clear dU;
    clear cU;

    kernel = zeros(1,1,3); kernel(1) = -1; kernel(2) = 1;
    dD = convn(filt_vol, kernel, 'full'); dD=dD(:,:,2:size(dD,3)-1,:);
    cD = repmat(sqrt(sum(dD.^2, 4)), [1 1 1 size(dD,4)]);
    filt_vol = filt_vol - 1/28 * ((exp(- (cD / kappa).^2 )) .* (dD));
    clear dD;
    clear cD;
end
[/cc]

Python

The Python code is very similar to the Matlab code above. It does 2D images or 3D volumes, but I have not coded the smoothing across the 4th dimension. That will have to be done later.
[cc lang="python"]
import numpy as np

def aniso(v, kappa=-1, N=1):

    # Default kappa: the 40th percentile of the data
    if kappa == -1:
        kappa = np.percentile(v, 40)

    # Work on a floating point copy so the input is left untouched
    vf = v.astype(float)

    for ii in range(N):
        # Differences to the neighbours along each axis
        dE = -vf + np.roll(vf, -1, 0)
        dW = vf - np.roll(vf, 1, 0)

        dN = -vf + np.roll(vf, -1, 1)
        dS = vf - np.roll(vf, 1, 1)

        if len(v.shape) > 2:
            dU = -vf + np.roll(vf, -1, 2)
            dD = vf - np.roll(vf, 1, 2)

        vf = vf + \
            3./28. * ((np.exp(- (abs(dE) / kappa)**2 ) * dE) - (np.exp(- (abs(dW) / kappa)**2 ) * dW)) + \
            3./28. * ((np.exp(- (abs(dN) / kappa)**2 ) * dN) - (np.exp(- (abs(dS) / kappa)**2 ) * dS))
        if len(v.shape) > 2:
            vf += 1./28. * ((np.exp(- (abs(dU) / kappa)**2 ) * dU) - (np.exp(- (abs(dD) / kappa)**2 ) * dD))

    return vf
[/cc]
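To see the edge-preserving behaviour on something simpler than an MPRAGE slice, the sketch below applies a compact 2D variant of the same update to a synthetic noisy step image. The helper `aniso2d`, the image size, the noise level and `kappa=10` are all assumptions made for this illustration. It is not the code used for the images above.

```python
import numpy as np

def aniso2d(img, kappa, n_iter=1, weight=3.0 / 28.0):
    """Compact 2D variant of the update above, looping over the two axes."""
    out = img.astype(float)
    for _ in range(n_iter):
        update = np.zeros_like(out)
        for axis in (0, 1):
            fwd = np.roll(out, -1, axis) - out   # difference to the next neighbour
            bwd = out - np.roll(out, 1, axis)    # difference from the previous neighbour
            update += weight * (np.exp(-(fwd / kappa) ** 2) * fwd
                                - np.exp(-(bwd / kappa) ** 2) * bwd)
        out = out + update
    return out

# Synthetic test image: a step edge of height 100 plus Gaussian noise (sd = 2)
rng = np.random.default_rng(0)
step = np.zeros((32, 32))
step[:, 16:] = 100.0
noisy = step + rng.normal(0, 2, step.shape)

smoothed = aniso2d(noisy, kappa=10.0, n_iter=10)

print("noise sd before: %.2f" % noisy[:, 2:14].std())
print("noise sd after:  %.2f" % smoothed[:, 2:14].std())
print("edge height:     %.1f" % (smoothed[:, 20:30].mean() - smoothed[:, 2:12].mean()))
```

The noise in the flat regions drops, while the step stays close to its original height because differences far above kappa get almost no weight.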

Noise in MRI (magnitude) data

Background

Magnitude MRI data has a Rician noise distribution ((Hákon Gudbjartsson and Samuel Patz, “The Rician Distribution of Noisy MRI Data,” Magnetic resonance in medicine : official journal of the Society of Magnetic Resonance in Medicine / Society of Magnetic Resonance in Medicine 34, no. 6 (December 1995): 910-914)). It comes about because the magnitude is the square root of the sum of the squares of two channels, each with Gaussian noise ((R M Henkelman, “Measurement of signal intensities in the presence of noise in MR images,” Medical Physics 12, no. 2 (April 1985): 232-233.)).  There is a longer description here.

Modeling

The Rician noise is created as $latex y_e(t_i) = \sqrt{ \left[y(t_i) + e_1 \right]^2 + e_2^2 }$, where $latex y$ is the true signal, and $latex e_1$ and $latex e_2$ are random numbers from a Gaussian distribution with zero mean and standard deviation $latex \sigma$.  The standard deviation, $latex \sigma$, for the Gaussian distribution is related to the signal to noise ratio and is typically on the order of 1% – 10% of the signal $latex y$.
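One consequence of this formula is a noise floor. With no true signal the magnitude follows a Rayleigh distribution with mean $latex \sigma\sqrt{\pi/2}$, whereas at high signal-to-noise the bias is small. The sketch below checks this numerically. The helper `rician`, the sample count and the seed are assumptions for illustration.

```python
import numpy as np

def rician(y, sigma, size, rng):
    """Draw noisy magnitude samples for a true signal y."""
    e1 = rng.normal(0.0, sigma, size)
    e2 = rng.normal(0.0, sigma, size)
    return np.sqrt((y + e1) ** 2 + e2 ** 2)

rng = np.random.default_rng(42)
sigma = 5.0
n = 200000

floor = rician(0.0, sigma, n, rng).mean()    # no true signal
high = rician(100.0, sigma, n, rng).mean()   # signal 20 times sigma

expected_floor = sigma * np.sqrt(np.pi / 2)  # mean of a Rayleigh distribution

print("mean at y = 0:   %.3f (expected %.3f)" % (floor, expected_floor))
print("mean at y = 100: %.3f" % high)
```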

Code

It is relatively easy to model this using Matlab or Python. For the code here I am modeling a T2 decay curve and then the noise.

Matlab

[cc lang="matlab"]

%  Setup the initial variables
rho = 100;
t2 = 80; % in ms
te = 10:10:320;  % in ms

%  Create a T2 decay curve
y = rho * exp(-te / t2 );

%  Define the noise to be 5% of the signal
s = 5;

%  Create the two Gaussian random variable vectors
e1 = s * randn(size(y));
e2 = s * randn(size(y));

%  Now create the new, noisy decay curve.
y_e = sqrt( (y+e1).^2 + (e2).^2 );

[/cc]

Python

The Python version is quite similar.

[cc lang="python"]

from __future__ import division
from numpy import exp, sqrt, r_
from numpy.random import normal

#  Set up the initial variables
rho = 100
t2 = 80 # in ms
te = r_[10:330:10] # in ms

#  Create a T2 decay curve
y = rho * exp( -te / t2 )

#  Define the noise to be 5% of the signal
s = 5

#  Create the two Gaussian random variable vectors
e1 = normal(0, s, y.shape)
e2 = normal(0, s, y.shape)

#  Now create the new, noisy decay curve.
y_e = sqrt( (y+e1)**2 + (e2)**2 )

[/cc]

There are a couple of small gotchas that at least tripped me up, as I am still relatively new to Python.

  1. The first is that under Python 2.x dividing two integers gives an integer result (Matlab defaults to doubles).  This changes in Python 3, but to get around it for now, the best thing to do is to add [cci lang="python"]from __future__ import division[/cci].
  2. To define [cci lang="python"]te[/cci] I had to go to 330, rather than 320, as the upper end of the slice is exclusive, so the last number is not included.
  3. There are several options for creating the random numbers.  There is a Python module called [cci lang="python"]random[/cci] that could be used.  Instead I used the Numpy [cci lang="python"]normal[/cci] function, as I can pass in the shape parameter.
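The three points can be checked in a few lines. This is a small sketch assuming Python 3 and Numpy imported as `np`. The variable names are made up for the example.

```python
import numpy as np

# 1. Division: in Python 3, / is true division and // is floor division
true_div = 7 / 2
floor_div = 7 // 2

# 2. Slice notation inside r_ excludes the upper bound
te_short = np.r_[10:320:10]   # last value is 310
te_full = np.r_[10:330:10]    # last value is 320

# 3. Numpy's normal() takes a shape and returns a whole array at once
noise = np.random.normal(0, 5, te_full.shape)

print(true_div, floor_div)
print(te_short[-1], te_full[-1], noise.shape)
```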

Change font size of tick labels

When making figures it is often important to increase the font size of the x or y tick labels. Here is one way I found to do it:

[cc lang="python"]
from pylab import figure, gca

fig1 = figure()
for t in gca().get_yticklabels():
    t.set_fontsize(14)

fig1.canvas.draw()

[/cc]

For some reason there has to be a [cci lang="python"]fig1.canvas.draw()[/cci] at the end of this to refresh the figure.
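Another route, sketched here with the object-oriented interface and the Agg backend so it runs without a display, is the axes method `tick_params`, which sets the label size for all ticks on an axis in one call.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(range(10))

# Set the font size of the y tick labels only
ax.tick_params(axis='y', labelsize=14)
fig.canvas.draw()

y_sizes = [t.get_fontsize() for t in ax.get_yticklabels()]
x_sizes = [t.get_fontsize() for t in ax.get_xticklabels()]
plt.close(fig)
```

Passing `axis='both'` changes the x and y tick labels together.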

Finding coordinates in MRI data volumes

I find myself wanting to run through a list of (x,y,z) coordinates of some data volume (here called “d”) to do some sort of processing on each voxel. What I have come up with is the following…

First, find the set of coordinates that match some criterion. For example, find all coordinates in “d” that are greater than the 70th percentile:
[cc lang="python"]coords = array( nonzero( d > percentile( d, 70 ) ) ).transpose()[/cc]
Now that we have the list of coordinates, we can run through each coordinate and do some sort of processing on it:
[cc lang="python"]
for ii,coord in enumerate( coords ):
    r = coord[0]
    c = coord[1]

    # more stuff here
[/cc]
Obviously if you are using a 3 dimensional volume “d” then you would use:
[cc lang="python"]
s = coord[0]
r = coord[1]
c = coord[2]
[/cc]
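Here is a self-contained sketch of the same pattern on a small made-up volume. It uses Numpy's `argwhere`, which returns the transposed `nonzero` result directly, and unpacks each coordinate in the loop.

```python
import numpy as np

# A small synthetic volume standing in for "d"
d = np.arange(24, dtype=float).reshape(2, 3, 4)

# Coordinates of all voxels above the 70th percentile, one (s, r, c) row per voxel
thresh = np.percentile(d, 70)
coords = np.argwhere(d > thresh)

total = 0.0
for s, r, c in coords:
    # more stuff here; as a placeholder, sum the selected voxels
    total += d[s, r, c]

print(coords.shape, total)
```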

Project Euler

A very interesting math website is Project Euler. There are over 250 mathematical problems to solve, in varying degrees of difficulty. The basic idea is to solve each problem with a snippet of code whose run time is less than 1 minute.

I have used this website to learn Python and have had great fun figuring out different ways of solving the problems. I can’t say all mine have completed in less than a minute, but I am getting there.

Python code for reading in Varian FDF files

Below is a Python class that will read in a Varian FDF file, or a Varian “.img” directory (which contains the FDF files). I have used this in the past, but can’t make any claims about it. I offer it up in hopes it is useful to someone.

[cc lang="python"]
import os
import re
import struct
from numpy import array, transpose

class Varian:

    def __init__(self):
        pass

    def read( self, filename ):
        if filename.endswith('.fdf'):
            data = self.readFDF( filename )
        elif filename.endswith('.img'):
            data = self.readIMG( filename )
        else:
            print("Unknown filename %s " % (filename))
            data = None

        return data

    def readFDF(self, filename ):

        fp = open( filename, 'rb' )

        xsize = -1
        ysize = -1
        zsize = 1
        bigendian = -1

        # Parse the text header, which ends at the form feed character
        while True:

            line = fp.readline()

            # Stop at the end of the header, or at the end of the file
            if( len( line ) == 0 or line[:1] == b'\x0c' ):
                break

            # The file is opened in binary mode, so decode the header line
            line = line.decode('ascii', 'ignore')

            if( line.find('bigendian') > 0 ):
                bigendian = int( line.split('=')[-1].rstrip('\n; ').strip(' ') )

            if( line.find('echos') > 0 ):
                nechoes = line.split('=')[-1].rstrip('\n; ').strip(' ')

            if( line.find('echo_no') > 0 ):
                echo_no = line.split('=')[-1].rstrip('\n; ').strip(' ')

            if( line.find('nslices') > 0 ):
                nslices = line.split('=')[-1].rstrip('\n; ').strip(' ')

            if( line.find('slice_no') > 0 ):
                sl = line.split('=')[-1].rstrip('\n; ').strip(' ')

            if( line.find('matrix') > 0 ):
                m = re.findall(r'(\d+)', line.rstrip())

                if len(m) == 2:
                    xsize, ysize = int(m[0]), int(m[1])
                elif len(m) == 3:
                    xsize, ysize, zsize = int(m[0]), int(m[1]), int(m[2])

        # The data are read as 4-byte floats from the end of the file
        fp.seek(-xsize*ysize*zsize*4,2)

        if bigendian == 1:
            fmt = ">%df" % (xsize*ysize*zsize)
        else:
            fmt = "<%df" % (xsize*ysize*zsize)

        data = struct.unpack(fmt, fp.read(xsize*ysize*zsize*4))
        data = array( data ).reshape( [xsize, ysize, zsize ] ).squeeze()

        fp.close()

        return data

    def readIMG(self, directory):

        # Get a list of all the FDF files in the directory
        try:
            files = os.listdir(directory)
        except OSError:
            print("Could not find the directory %s" % directory)
            return

        # Sort by name, as os.listdir returns the files in arbitrary order
        files = sorted( f for f in files if f.endswith('.fdf') )

        data = []
        for f in files:
            data.append( self.readFDF( os.path.join( directory, f ) ) )

        data = transpose( array( data ), (1,2,0) )

        return data

[/cc]
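The `struct.unpack` step builds a Python tuple of floats before converting it to an array. A possible shortcut, sketched here on made-up bytes rather than a real FDF file, is Numpy's `frombuffer` with an explicit byte order in the dtype (`'<f4'` for little-endian, `'>f4'` for big-endian 4-byte floats).

```python
import struct

import numpy as np

# Six little-endian 4-byte floats standing in for the data section of a file
values = [1.0, 2.5, -3.0, 4.25, 0.0, 6.5]
raw = struct.pack('<6f', *values)

# The approach used in the class above
via_struct = np.array(struct.unpack('<6f', raw)).reshape(2, 3)

# The same bytes interpreted directly by numpy
via_numpy = np.frombuffer(raw, dtype='<f4').reshape(2, 3)

print(np.array_equal(via_struct, via_numpy))
```

Note that `frombuffer` keeps the data as 32-bit floats and returns a read-only view, so call `.astype(float)` if a writable double-precision array is needed.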

Python Dicom

There is a great Python package pydicom that implements a nice interface in order to be able to access data within Dicom files.

One application which I wrote up was a dicom directory summarizer which goes through a list of dicom files and summarizes the types of MRI data in the directory.  I found myself getting frustrated trying to figure out which series of data was which given the huge number of dicom files (with really long names too!) in a directory.

The code below may be run within a Dicom directory and should run on Siemens Dicom data (IMA) files. It has been a while since I last ran it, so I can’t guarantee that it will work, but it should be a good place to start.

[cc lang="python"]
#! /usr/bin/python

import os
import re

# Current pydicom releases are imported as "pydicom" and read files with dcmread
import pydicom

def blah(val):
    # Match the first file of a series
    return re.compile(r'[\-\w]+\.MR\.[\-\w]+\.\d+\.1\..*').match(val)

# Get a list of all the files
files = []
for entry in os.listdir('.'):
    if not os.path.isdir(entry) and entry.endswith('IMA'):
        files.append(entry)

# Filter to find the first of each series
firsts = [f for f in files if blah(f)]

# Sort numerically by series number
firsts.sort(key=lambda s: int( re.compile(r'[\-\w]+\.MR\.[\-\w]+\.(\d+)\.1\..*').search(s).group(1)) )

# Read the first and output some interesting stuff
d = pydicom.dcmread(firsts[0])
print(" Patient: " + str(d.PatientName))
print("Acquired: " + d.StudyDate[0:4]+"-"+d.StudyDate[4:6]+"-"+d.StudyDate[6:8]
      + " " + d.StudyTime[0:2] + ":" + d.StudyTime[2:4] + ":" + d.StudyTime[4:6])
print("Comments: " + d.ImageComments)

# Run through the first file of each of the series
for entry in firsts:
    d = pydicom.dcmread(entry)

    num = re.compile(r'[\-\w]+\.MR\.[\-\w]+\.(\d+)\.1\..*').search(entry).group(1)

    out = "\t" + str(num) + ") " + d.SeriesDescription

    # Count the files that belong to this series
    tt = r'[_\-\w]+\.MR\.[_\-\w]+\.'+str(num)+r'\..*'
    count = 0
    r = re.compile(tt)
    for f in files:
        if( r.match(f) ):
            count = count + 1

    if( not re.compile(".*(FA|TRACEW|TENSOR|ADC|MoCoSeries)$").match(d.SeriesDescription) ):
        out += " (vols=" + str(count)

        if( 'RepetitionTime' in d ):
            out += ", TR=" + str(d.RepetitionTime)

        if( 'EchoTime' in d ):
            out += ", TE=" + str(d.EchoTime)

        out += ")"

    print(out)

[/cc]
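The file name matching can be tried without any Dicom files. In the sketch below the file names are hypothetical, made up only to fit the pattern used above, and a `Counter` replaces the inner counting loop.

```python
import re
from collections import Counter

# Hypothetical file names that fit the pattern used in the script
files = [
    'SUBJ01.MR.HEAD.2.1.2010.01.01.IMA',
    'SUBJ01.MR.HEAD.2.2.2010.01.01.IMA',
    'SUBJ01.MR.HEAD.10.1.2010.01.01.IMA',
    'SUBJ01.MR.HEAD.2.3.2010.01.01.IMA',
    'notes.txt',
]

# Group 1 is the series number, group 2 the image number within the series
pattern = re.compile(r'[\-\w]+\.MR\.[\-\w]+\.(\d+)\.(\d+)\..*')

counts = Counter()
for name in files:
    m = pattern.match(name)
    if m:
        counts[int(m.group(1))] += 1

# Sorting the integers puts series 2 before series 10
series = sorted(counts)

for num in series:
    print("%d) %d files" % (num, counts[num]))
```

Converting the series number to an integer before sorting matters. As strings, '10' would sort before '2'.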

Progressbar

Much of my MR research is quite computationally expensive, so there are many times I have sat wondering how far through a certain loop I am.  Enter progressbar.  It is a nice small package that shows me where I am in my loop.

Here is an example of what I typically do:

[cc lang="python"]

import numpy
from numpy import array
from progressbar import ProgressBar, Percentage, Bar, ETA

# _data and thresh come from the surrounding analysis code
coords = array( numpy.nonzero( _data[-1] > thresh ) ).transpose()

pbar = ProgressBar(widgets=['Calc Offset Map ', Percentage(), Bar(), ETA()], maxval=coords.shape[0]).start()

for ii,coord in enumerate(coords):

    # some big calculation here
    pbar.update(ii)

pbar.finish()

[/cc]

This gives me a really nice, informative and pleasing text progressbar.
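On a machine where the package is not installed, a rough stand-in needs only the standard library. The sketch below is not the progressbar API. The helper `format_bar` is made up for illustration and has no ETA.

```python
import sys

def format_bar(done, total, width=20):
    """Return a text bar such as '[#####-----]  50%'."""
    filled = int(width * done / total)
    percent = int(100 * done / total)
    return '[' + '#' * filled + '-' * (width - filled) + '] %3d%%' % percent

total = 8
for ii in range(total):
    # some big calculation here
    sys.stdout.write('\r' + format_bar(ii + 1, total))
    sys.stdout.flush()
sys.stdout.write('\n')
```

The carriage return moves the cursor back to the start of the line, so each update overwrites the previous bar.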