Extracting data from a messy text file

There is a header followed by row names. I want to extract the numeric data for Time, Area and Volume, then group them together into a convenient format for analysis. I’ve tried textscan and sscanf. I haven’t tried regexp because I’ve never used it before!


It’s just a repetitive application of textscan…

fmt1='Time       [T] %f';
fmt2='Area [V] %f %f %f Volume [V] %f %f %f';
% read the first set separately, as it has its own number of header lines
time=cell2mat(textscan(fid, fmt1,'headerlines',10)); % 1st time value
data=cell2mat(textscan(fid, fmt2, ...
% the second set also has its own number of lines to skip...
time=[time; cell2mat(textscan(fid, fmt1,'headerlines',5))];
data=[data; cell2mat(textscan(fid, fmt2, 'headerlines',3, ...
while ~feof(fid)
time=[time; cell2mat(textscan(fid, fmt1,'headerlines',7))];
data=[data; cell2mat(textscan(fid, fmt2, 'headerlines',3, ...

At the end you’ll have an N-by-1 vector of times and an N-by-6 array of areas and volumes (per fmt2, columns 1-3 are Area and columns 4-6 are Volume). You could either concatenate time and data into one array or separate out A and V based on the columns in data; your choice.
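As a rough sketch of those two options, something like the following should work. The values of time and data here are made-up stand-ins so the snippet runs on its own; in practice they would come from the textscan calls above. The names area, volume, combined and results are just illustrative.

```matlab
% Stand-in values (assumed for illustration only); replace with the
% time and data arrays built by the textscan loop.
time = [0; 1; 2];
data = [ 1  2  3  4  5  6;
         7  8  9 10 11 12;
        13 14 15 16 17 18];

% Option 1: split data by column, following the order in fmt2
area   = data(:, 1:3);   % the three Area values per time step
volume = data(:, 4:6);   % the three Volume values per time step

% Option 2: one N-by-7 array with time in the first column
combined = [time data];

% Optionally, a table keeps the three groups labelled
results = table(time, area, volume);
```

The table form is handy if you want to refer to the groups by name later (for example results.area(:, 2)), while the plain array is simpler to pass to plotting or fitting functions.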

At the command line the above gives me




