rfwtools.example.WindowedExample

class rfwtools.example.WindowedExample(zone, dt, cavity_label, fault_label, cavity_conf, fault_conf, label_source, start, n_samples, data_dir=None)[source]

Bases: Example

An extension of the Example class that lets the caller request that only a time window of event_df be returned.

This window is based on the relative values in the Time column. The standard time range is approximately [-1500, 100], but this varies with control system settings. An exception is raised at load time if the specified time range is not a strict subset of the Example's Time column.

__init__(zone, dt, cavity_label, fault_label, cavity_conf, fault_conf, label_source, start, n_samples, data_dir=None)[source]

Construct an instance that will only store the required window upon a load_data() call.

Parameters:
  • start (Optional[float]) – The start of the time window.

  • n_samples (int) – The number of samples to include after the start of the window.

Methods

__init__(zone, dt, cavity_label, ...[, data_dir])

Construct an instance that will only store the required window upon a load_data() call.

capture_files_on_disk([compressed])

Checks if capture files are currently saved to disk.

convert_waveform_column_names(columns)

Turns waveform PV names (R1M1WFSGMES) into a more uniform name based on cavity and waveform (1_GMES)

get_capture_file_list()

Creates a list of capture file names.

get_event_path([compressed])

Generates the expected location for uncompressed event waveform data.

get_example_type()

Get this Example's ExampleType.

has_matching_labels(example)

Check if the supplied example has the same cavity and fault type label.

is_capture_file(filename)

Validates if filename appears to be a valid capture file.

load_data([verbose])

Load the fault event data according to Example.load_data() and retain only the defined time window.

parse_capture_file(file)

Parses an individual capture file into a Pandas DataFrame object.

parse_event_dir(event_path[, compressed])

Parses the capture files in the BaseModel's event_dir and sets event_df to the appropriate pandas DataFrame.

plot_waveforms([signals, downsample])

Plot the waveform data associated with this example.

remove_event_df_from_disk()

Deletes the 'cached' event waveform data for this event from disk.

save_event_df_to_disk(event_df)

Saves the event waveform DataFrame to disk.

to_string()

This provides a more descriptive string than __str__.

unload_data([verbose])

Top-level method for deleting the Example's data (event_df) from memory.

Attributes

capture_file_regex

A regex for matching capture file filenames

e_type

The type of Example this represents.

start

The start of the window relative to the fault onset

n_samples

The number of samples requested after the start value

end

(float) The last Time value in the window.

capture_file_regex = re.compile('R.*harv\\..*\\.txt')

A regex for matching capture file filenames

Type:

(re.Pattern)

capture_files_on_disk(compressed=False)

Checks if capture files are currently saved to disk.

Parameters:

compressed (bool) – Are we checking for compressed file (True), or uncompressed (False, default)?

Return type:

bool

Returns:

True if the compressed file or the regular directory was found.

cavity_conf

Cavity label confidence

Type:

(float)

cavity_label

Expert/model provided cavity label

Type:

(str)

static convert_waveform_column_names(columns)

Turns waveform PV names (R1M1WFSGMES) into a more uniform name based on cavity and waveform (1_GMES)

Parameters:

columns (List[str]) – List of waveform columns from a single zone, i.e., a list of event waveform names to convert.

Return type:

List[str]

Returns:

The updated/standardized column names sans zone identifier.
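The renaming described above can be illustrated with a short regex-based sketch. This is not the library's implementation: the helper name `to_uniform_names` is hypothetical, and the pattern assumes every waveform PV has the form R + two zone characters + one cavity digit + WFS + waveform name, as in the two examples given in this page. It also assumes a Time column passes through unchanged.

```python
import re

# Assumed PV layout: R, two zone characters, a cavity digit, "WFS", waveform name.
PV_PATTERN = re.compile(r"^R\w{2}(\d)WFS(\w+)$")


def to_uniform_names(columns):
    """Hypothetical helper mirroring the documented renaming."""
    converted = []
    for column in columns:
        if column == "Time":
            # Assumption: the Time column is left as-is.
            converted.append(column)
            continue
        match = PV_PATTERN.match(column)
        if match is None:
            raise ValueError(f"Unexpected column name format: {column}")
        cavity, waveform = match.groups()
        converted.append(f"{cavity}_{waveform}")
    return converted


names = to_uniform_names(["Time", "R1M1WFSGMES", "R123WFSGMES", "R1M2WFSDETA2"])
print(names)  # ['Time', '1_GMES', '3_GMES', '2_DETA2']
```

Dropping the zone identifier is what lets the same analysis code address, say, cavity 1's GMES signal regardless of which zone the event came from.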

data_dir

The directory where the waveform data can be found. None if Config is to be referenced.

Type:

(str)

e_type

The type of Example this represents. Convenient type tracking.

Type:

(ExampleType)

end

(float) The last Time value in the window. Determined after loading data for the first time.

event_datetime

When did the event occur

Type:

(datetime.datetime)

event_df

The DataFrame for holding the actual waveform data

Type:

(pd.DataFrame)

event_zone

Which zone had the event

Type:

(str)

fault_conf

Fault label confidence

Type:

(float)

fault_label

Expert/model provided fault label

Type:

(str)

get_capture_file_list()

Creates a list of capture file names. Typically, this has eight file names.

This replaced a method that read in file contents and returned a dictionary of names to content. The only internal use case was getting the list of file names, so it was replaced with the simpler method.

Return type:

List[str]

Returns:

A list of capture file names for the Example.

get_event_path(compressed=False)

Generates the expected location for uncompressed event waveform data.

Parameters:

compressed (bool) – Should the returned path be for a compressed (tgz) event?

Return type:

str

Returns:

The expected path to uncompressed directory of waveform data.

get_example_type()

Get this Example’s ExampleType.

Return type:

ExampleType

Returns:

The Enum corresponding to the class type

has_matching_labels(example)

Check if the supplied example has the same cavity and fault type label.

Parameters:

example (Example) – An Example object to compare labels against

Return type:

bool

Returns:

True if both cavity and fault labels match. False otherwise.

static is_capture_file(filename)

Validates if filename appears to be a valid capture file.

Parameters:

filename (str) – The name of the file that is to be validated

Returns:

True if the filename appears to be a valid capture file. Otherwise False.

Return type:

bool
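As a rough illustration of the filename check, the sketch below applies the documented capture_file_regex pattern with `re.match`. Whether the library uses `match`, `search`, or `fullmatch` is not stated here, so that choice is an assumption, as are the helper name `looks_like_capture_file` and the sample filenames.

```python
import re

# Pattern copied from the documented capture_file_regex attribute.
capture_file_regex = re.compile('R.*harv\\..*\\.txt')


def looks_like_capture_file(filename):
    """Hypothetical stand-in for is_capture_file(), assuming a match() test."""
    return capture_file_regex.match(filename) is not None


# Both filenames are invented for illustration.
good = looks_like_capture_file("R1M1WFSharv.2020_01_01_120000.1.txt")
bad = looks_like_capture_file("notes.txt")
print(good, bad)  # True False
```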

label_source

Source of labels (which model, file, etc.)

Type:

(str)

load_data(verbose=False)[source]

Load the fault event data according to Example.load_data() and retain only the defined time window.

This changes the Time column from being relative to the fault onset to being relative to the start of the window. This means that the first time value is unlikely to be exactly 0, but should be a small positive number.

Parameters:

verbose (bool) – Should extra information be printed during operation

Raises:

RuntimeError – If the specified window of data is not available.

Return type:

None
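The windowing and Time shift described for load_data() can be sketched on a plain pandas DataFrame. This is a minimal illustration, not the library's code: the helper `window_event_df` is hypothetical, and it assumes the Time column is sorted ascending, that the window begins at the first sample with Time >= start, and that too few samples counts as the window being unavailable.

```python
import numpy as np
import pandas as pd


def window_event_df(event_df, start, n_samples):
    """Hypothetical helper: keep n_samples rows beginning at Time >= start."""
    time = event_df["Time"]
    # Assumption: the window must begin inside the recorded Time range.
    if start < time.iloc[0]:
        raise RuntimeError("Window starts before the first Time value")
    window = event_df.loc[time >= start].head(n_samples).copy()
    # Assumption: a short window means the requested data is not available.
    if len(window) < n_samples:
        raise RuntimeError("Not enough samples after the window start")
    # Re-reference Time to the start of the window instead of the fault onset.
    window["Time"] = window["Time"] - start
    return window.reset_index(drop=True)


# Synthetic data: Time runs from -9.75 to 9.25 in steps of 1.0.
demo_df = pd.DataFrame({"Time": np.arange(-10, 10, dtype=float) + 0.25})
demo_df["1_GMES"] = np.linspace(0.0, 1.0, len(demo_df))

windowed = window_event_df(demo_df, start=-5.0, n_samples=4)
print(windowed["Time"].tolist())  # [0.25, 1.25, 2.25, 3.25]
```

The first shifted Time value is 0.25 rather than 0 because no sample falls exactly on the requested start, which matches the note above about a small positive first value.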

n_samples

The number of samples requested after the start value

Type:

(int)

static parse_capture_file(file)

Parses an individual capture file into a Pandas DataFrame object.

Reads all data in as float64 dtypes because a column of all integers (e.g., all zeroes) would otherwise default to an integer dtype.

Parameters:

file (file) – Either the filename as a string or a file-like object.

Return type:

DataFrame

Returns:

A pandas DataFrame containing the data from the specified capture file.
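The reason for forcing float64 can be seen with a small pandas example. The capture text below is invented, and the tab-separated layout with a header row is an assumption about the file format, not something this page confirms.

```python
import io

import pandas as pd

# Invented capture content; a tab-separated layout with a header is assumed.
capture_text = (
    "Time\tR1M1WFSGMES\tR1M1WFSGASK\n"
    "-1.0\t5\t0\n"
    "0.0\t6\t0\n"
)

# With default type inference, all-integer columns become int64.
inferred_df = pd.read_csv(io.StringIO(capture_text), sep="\t")

# Forcing float64 gives every column the same dtype.
float_df = pd.read_csv(io.StringIO(capture_text), sep="\t", dtype="float64")

print(inferred_df.dtypes.astype(str).tolist())  # ['float64', 'int64', 'int64']
print(float_df.dtypes.astype(str).tolist())  # ['float64', 'float64', 'float64']
```

A uniform dtype avoids surprises when DataFrames from several capture files are later combined or passed to numeric code that expects floats.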

static parse_event_dir(event_path, compressed=False)

Parses the capture files in the BaseModel’s event_dir and sets event_df to the appropriate pandas DataFrame.

The waveform names are converted from <EPICS_NAME><Waveform> (e.g., R123WFSGMES), to <Cavity_Number>_<Waveform> (e.g., 3_GMES). This allows analysis code to more easily handle waveforms from different zones.

Parameters:
  • event_path (str) – The path to the event directory or compressed tar.gz file

  • compressed (bool) – Is the data a compressed tar.gz file or a regular directory

Raises:

ValueError – if a column name is discovered with an unexpected format

Return type:

None

plot_waveforms(signals=None, downsample=32)

Plot the waveform data associated with this example. Optionally down sample the signals.

Parameters:
  • signals (Optional[List[str]]) – A list of signal names to plot, e.g. ‘1_GMES’. If None, then GMES, DETA2, GASK, CRFP, and PMES will be plotted for all cavities

  • downsample (int) – The down sampling factor, i.e., keep every <downsample>-th point. By default keep every 32nd point

Return type:

None
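The downsample factor amounts to keeping every <downsample>-th point. A minimal sketch of that step on synthetic data, assuming plain strided slicing (the plotting itself is omitted, and the library may select points differently):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for event_df with 128 samples.
event_df = pd.DataFrame({
    "Time": np.arange(128, dtype=float),
    "1_GMES": np.arange(128, dtype=float) * 2.0,
})

downsample = 32
# Keep rows 0, 32, 64, 96: every 32nd point.
reduced = event_df.iloc[::downsample]
print(reduced["Time"].tolist())  # [0.0, 32.0, 64.0, 96.0]
```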

remove_event_df_from_disk()

Deletes the ‘cached’ event waveform data for this event from disk. Both compressed and uncompressed data.

Return type:

None

save_event_df_to_disk(event_df)

Saves the event waveform DataFrame to disk. Can provide faster access to ‘raw’ data later.

If capture files already exist, it won’t try to overwrite them. Does nothing if event_path is None. Note that every capture file will end up with the same timestamp as self.event_datetime.

Parameters:

event_df (DataFrame) – The DataFrame for which we should create a fault event directory of capture files.

Return type:

None

start

The start of the window relative to the fault onset

Type:

(float)

to_string()

This provides a more descriptive string than __str__.

Return type:

str

Returns:

A string representation of the example including zone, time, label info, and label source.

unload_data(verbose=False)

Top-level method for deleting the Example's data (event_df) from memory.

Parameters:

verbose (bool) – Should extra information be printed to STDOUT

Return type:

None