Usage

Functionality

The curve fitter can be used for smoothing the data.

class gfapy.curve_fit.curve_fitter

Bases: object

A class for fitting data to various mathematical models such as Polynomial, Exponential, Power, Logarithmic, Fourier, Gaussian, Weibull, Hill-type, and Sigmoidal models. It also allows custom model expressions.

__init__()

Initializes the curve_fitter object by defining a set of mathematical models and their variations. The user can select one of these models or provide a custom function.

curr_model_name

Name of the model selected by the user via choose_model.

model

The model selected by the user via choose_model.

current_stats

Dictionary of details about the fit, populated once a fit has been run. You can retrieve the fitted function and its derivative with the get_fitmodel and get_diffmodel methods.

ingest_data(data, x_col)

Ingests the dataset to be used for curve fitting.

Parameters:
  • data (pd.DataFrame) – The dataset containing the independent and dependent variables.

  • x_col (str) – The column name representing the independent variable (x).

choose_model(model_name)

Selects a predefined model or the custom model option. If the chosen model has submodels, choose_submodel must be called afterwards.

Parameters:

model_name (str) – The name of the model to choose from the predefined set of models or ‘Custom’.

choose_submodel(submodel_name)

Selects a submodel from the chosen model if applicable.

Parameters:

submodel_name (str) – The name of the submodel to select.

custom_func(model_string)

Defines a custom model expression when the ‘Custom’ model option is selected.

Parameters:

model_string (str) – A string representing the custom mathematical model in Python syntax.
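
To illustrate what a model string in Python syntax could look like, the sketch below evaluates one with the standard library only. How gfapy parses model_string internally is not documented here; the evaluate_model helper, the variable name x, and the parameter names a and b are assumptions made for this example.

```python
import math

def evaluate_model(model_string, x, params):
    """Evaluate a Python-syntax expression at one x value (hypothetical helper)."""
    # Expose only a few math functions, x, and the named parameters to the expression.
    namespace = {name: getattr(math, name) for name in ("exp", "log", "sin", "cos", "sqrt")}
    namespace.update(params)
    namespace["x"] = x
    return eval(model_string, {"__builtins__": {}}, namespace)

model_string = "a * exp(-b * x)"  # assumed form of a custom expression
value = evaluate_model(model_string, 0.0, {"a": 2.0, "b": 0.5})
```

The same string can then be evaluated at every x value in a dataset; the library's own parser may accept a different set of function names.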

calc_diff(fitted_params)

Calculates the derivative of the fitted model with respect to the independent variable.

Parameters:

fitted_params (dict) – The parameters of the fitted model.

Returns:

The derivative values of the fitted model for the data’s x-values.

Return type:

np.ndarray
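
As a rough picture of what "derivative values at the data's x-values" means, the sketch below differentiates a quadratic model by hand and cross-checks the result with a finite difference. It does not use gfapy; the model, the parameter names, and the numbers are invented for illustration.

```python
def model(x, p):
    # Quadratic model: a*x**2 + b*x + c
    return p["a"] * x**2 + p["b"] * x + p["c"]

def model_derivative(x, p):
    # Analytic derivative of the quadratic with respect to x: 2*a*x + b
    return 2 * p["a"] * x + p["b"]

fitted_params = {"a": 2.0, "b": 3.0, "c": 1.0}  # made-up fitted values
x_values = [0.0, 1.0, 2.0]
derivs = [model_derivative(x, fitted_params) for x in x_values]

# Cross-check against a central finite difference.
h = 1e-6
numeric = [
    (model(x + h, fitted_params) - model(x - h, fitted_params)) / (2 * h)
    for x in x_values
]
```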

get_diffmodel(fitted_params)

Retrieves the symbolic derivative of the model.

Parameters:

fitted_params (dict) – The parameters of the fitted model.

Returns:

A function that computes the derivative values.

Return type:

callable

get_fitmodel(fitted_params)

Retrieves the symbolic model after substituting the fitted parameters.

Parameters:

fitted_params (dict) – The parameters of the fitted model.

Returns:

A function representing the fitted model.

Return type:

callable

fit(y_col, start_params=None, **kwargs)

Fits the chosen model to the dataset.

Parameters:
  • y_col (str) – The column name representing the dependent variable (y).

  • start_params (list or None) – Initial guesses for the model parameters. If None, default guesses are used.

  • kwargs – Additional keyword arguments to pass to the curve fitting function.

Returns:

A tuple containing the fitted parameters, fitted y-values, the statistical results, and the plot axis.

Return type:

tuple
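
The shape of that return value (parameters, fitted y-values, statistics) can be pictured with a minimal least-squares line fit written in plain Python. This is only a sketch of the idea, not gfapy's fitting routine: fit_line is a hypothetical helper, the plot axis is omitted, and the data are invented.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [slope * x + intercept for x in xs]
    # Coefficient of determination as a simple goodness-of-fit statistic.
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    stats = {"r_squared": 1.0 - ss_res / ss_tot}
    return {"slope": slope, "intercept": intercept}, fitted, stats

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.1, 4.9, 7.0]  # made-up measurements
params, fitted, stats = fit_line(xs, ys)
```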

fit_jupyter(y_col)

Fits the chosen model to the dataset and displays the fitted results interactively in a Jupyter notebook environment.

Parameters:

y_col (str) – The column name representing the dependent variable (y).

class gfapy.igfa_fitflux.res_plotter(results, multi_result=False, met_meta=None, rxn_meta=None, name=None, time_col_name='Time (WD)')

Bases: res_plotter_generic

A class used to plot and analyze flux perturbations based on model results.

Inherits from res_plotter_generic.

__init__(results, multi_result=False, met_meta=None, rxn_meta=None, name=None, time_col_name='Time (WD)')

Initialize the res_plotter object with results data and metadata.

Parameters:
  • results (pd.DataFrame or dict) – results data from the model.

  • multi_result (bool) – whether the results contain multiple datasets (default False).

  • met_meta (dict or None) – metadata for metabolites (default None).

  • rxn_meta (dict or None) – metadata for reactions (default None).

  • name (str or None) – the name of the dataset (default None).

  • time_col_name (str or None) – name of the column containing time data (default ‘Time (WD)’).

fetch_nominal_vars(result)

Extract and organize the nominal fluxes from the result dataset.

Parameters:

result (dict) – containing ‘secreted_flux’, ‘internal_flux’, and ‘entry_flux’ data.

Returns:

organized nominal flux data for secreted, internal, and entry fluxes.

Return type:

dict

plot_secretedflux(time_col, meas_cols=None, ncols=5, figsize=None, res_ids=None, orig_kwargs=None, smooth_kwargs=None)

Plot secreted flux data over time, comparing measured and predicted values.

Parameters:
  • time_col (str) – name of the column containing time data.

  • meas_cols (list of str or None) – list of columns to be plotted (default None).

  • ncols (int) – number of columns in the plot grid (default 5).

  • figsize (tuple or None) – size of the figure (default None).

  • res_ids (list or None) – result IDs for predicted fluxes (default None).

  • orig_kwargs (dict or None) – additional keyword arguments for the measured data plot (default None).

  • smooth_kwargs (dict or None) – additional keyword arguments for the predicted data plot (default None).

Returns:

The plotted Matplotlib figure.

Return type:

plt.Figure

static get_pertresults(instance, time_col_name)

Extract perturbation results from a Pyomo model instance.

Parameters:
  • instance (pyo.ConcreteModel) – the instance of the model containing perturbation results.

  • time_col_name (str) – name of the column containing time data.

Returns:

perturbation results for secreted flux, internal flux, and other variables.

Return type:

dict

perturb_alpha(model, perc_keys, perc_val=0.1, nominal_result=None, solver_opts=None, get_fullresults=False)

Perform perturbation analysis on alpha parameters in the model.

Parameters:
  • model (pyo.ConcreteModel) – the model used for perturbation analysis.

  • perc_keys (list of str) – the keys (enzymes) to perturb.

  • perc_val (float) – the perturbation percentage (default 0.1).

  • nominal_result (dict or None) – nominal results to use for comparison (default None).

  • solver_opts (dict or None) – solver options for Pyomo (default None).

  • get_fullresults (bool) – whether to return the full perturbation results (default False).

Returns:

depending on the value of get_fullresults, returns normalized secreted flux and internal flux perturbations.

Return type:

tuple or dict
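
One common way to normalize a perturbation result is to divide the relative change in each flux by the relative size of the perturbation. The sketch below shows that calculation in plain Python; whether gfapy normalizes in exactly this way is an assumption, and the helper name and flux values are invented.

```python
def normalized_sensitivity(nominal, perturbed, perc_val=0.1):
    """Relative change in each flux divided by the relative perturbation size."""
    return {
        key: ((perturbed[key] - nominal[key]) / nominal[key]) / perc_val
        for key in nominal
    }

nominal_flux = {"G0F": 10.0, "G1F": 4.0}    # made-up nominal fluxes
perturbed_flux = {"G0F": 10.5, "G1F": 3.8}  # made-up fluxes after a 10% perturbation
sens = normalized_sensitivity(nominal_flux, perturbed_flux, perc_val=0.1)
```

Under this convention, a value of 0.5 means the flux rose by half as much, in relative terms, as the perturbed parameter.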

perturb_vref(model, perc_keys, perc_val=0.1, nominal_result=None, solver_opts=None, get_fullresults=False)

Perform perturbation analysis on reference velocities (vref) in the model.

Parameters:
  • model (pyo.ConcreteModel) – the model used for perturbation analysis.

  • perc_keys (list of str) – the keys (reactions) to perturb.

  • perc_val (float) – the perturbation percentage (default 0.1).

  • nominal_result (dict or None) – nominal results to use for comparison (default None).

  • solver_opts (dict or None) – solver options for Pyomo (default None).

  • get_fullresults (bool) – whether to return the full perturbation results (default False).

Returns:

depending on the value of get_fullresults, returns normalized secreted flux and internal flux perturbations.

Return type:

tuple or dict

plot_perturb(perturb_data, x_col, time_col, meas_cols=None, ncols=5, figsize=None, **plot_kwargs)

Plot perturbation data using a bar plot, with optional customization.

Parameters:
  • perturb_data (pd.DataFrame) – perturbation results to be plotted.

  • x_col (str) – Column to be used as the x-axis (usually enzyme or reaction name).

  • time_col (str) – Column indicating time points.

  • meas_cols (list of str or None) – List of columns to be plotted (default None).

  • ncols (int) – Number of columns in the plot grid (default 5).

  • figsize (tuple or None) – Size of the figure (default None).

  • plot_kwargs (dict) – Additional keyword arguments for the plot.

Returns:

The plotted figure.

Return type:

plt.Figure

choose_top_n(obj_col, n=10, ascending=True)

Chooses the top N results based on the specified objective column.

Parameters:
  • obj_col (str) – The column name for objectives to sort by.

  • n (int) – The number of top results to return.

  • ascending (bool) – Whether to sort in ascending order.

Returns:

A list of the top N result indices.

Return type:

list
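
The selection logic can be sketched in plain Python as a sort followed by a slice. The top_n_indices helper and the results dictionary below are hypothetical stand-ins for the plotter's internal results table.

```python
def top_n_indices(results, obj_col, n=10, ascending=True):
    """Return the keys of the n best entries ranked by results[key][obj_col]."""
    ranked = sorted(results, key=lambda k: results[k][obj_col], reverse=not ascending)
    return ranked[:n]

results = {
    0: {"objective": 3.2},
    1: {"objective": 0.7},
    2: {"objective": 1.5},
    3: {"objective": 0.9},
}  # made-up objective values per run
best = top_n_indices(results, "objective", n=2)
```

With ascending=True the smallest objective values come first, which suits a minimization problem.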

get_params_n(n_res, param_col)

Retrieves parameters for a given list of results.

Parameters:
  • n_res (list) – A list of result indices.

  • param_col (str) – The column name of the parameters to retrieve.

Returns:

A DataFrame of the specified parameters.

Return type:

pd.DataFrame

plot_alphas(time_col, meas_cols=None, ncols=5, figsize=None, res_ids=None, **plot_kwargs)

Plots the alpha values over time.

Parameters:
  • time_col (str) – The name of the time column.

  • meas_cols (list, optional) – The measurement columns to plot.

  • ncols (int) – The number of columns in the plot grid.

  • figsize (tuple, optional) – The size of the figure.

  • res_ids (list, optional) – The result IDs to plot.

  • plot_kwargs – Additional keyword arguments for plotting.

Returns:

The created figure.

Return type:

plt.Figure

plot_betas(time_col, meas_cols=None, ncols=5, figsize=None, res_ids=None, orig_kwargs=None, smooth_kwargs=None)

Plot beta values over time.

Parameters:
  • time_col – The name of the time column.

  • meas_cols – The columns to measure, defaults to None (uses all columns).

  • ncols – Number of columns in the subplot layout, defaults to 5.

  • figsize – The size of the figure, defaults to None (auto-calculated).

  • res_ids – List of result IDs, defaults to the current index if None.

  • orig_kwargs – Additional keyword arguments for the original plot.

  • smooth_kwargs – Additional keyword arguments for the smooth plot.

Return type:

plt.Figure

Returns:

The figure containing the plots.

plot_entry_flux(time_col='Time (WD)', meas_cols=None, ncols=5, figsize=None, res_ids=None, **plot_kwargs)

Plot entry flux values over time.

Parameters:
  • time_col – The name of the time column, defaults to ‘Time (WD)’.

  • meas_cols – The columns to measure, defaults to None (uses all columns).

  • ncols – Number of columns in the subplot layout, defaults to 5.

  • figsize – The size of the figure, defaults to None (auto-calculated).

  • res_ids – List of result IDs, defaults to the current index if None.

  • plot_kwargs – Additional keyword arguments for the plot.

Return type:

plt.Figure

Returns:

The figure containing the plots.

plot_fracs(time_col, meas_cols=None, ncols=5, figsize=None, res_ids=None, orig_kwargs=None, smooth_kwargs=None)

Plot fraction values over time.

Parameters:
  • time_col – The name of the time column.

  • meas_cols – The columns to measure, defaults to None (uses all columns).

  • ncols – Number of columns in the subplot layout, defaults to 5.

  • figsize – The size of the figure, defaults to None (auto-calculated).

  • res_ids – List of result IDs, defaults to the current index if None.

  • orig_kwargs – Additional keyword arguments for the original plot.

  • smooth_kwargs – Additional keyword arguments for the smooth plot.

Return type:

plt.Figure

Returns:

The figure containing the plots.

plot_gamma(time_col='Time (WD)', meas_cols=None, ncols=5, figsize=None, res_ids=None, **plot_kwargs)

Plot gamma values over time.

Parameters:
  • time_col – The name of the time column, defaults to ‘Time (WD)’.

  • meas_cols – The columns to measure, defaults to None (uses all columns).

  • ncols – Number of columns in the subplot layout, defaults to 5.

  • figsize – The size of the figure, defaults to None (auto-calculated).

  • res_ids – List of result IDs, defaults to the current index if None.

  • plot_kwargs – Additional keyword arguments for the plot.

Return type:

plt.Figure

Returns:

The figure containing the plots.

plot_interactive(init_model, port=5000)

Launches an interactive Dash application for glycosylation flux analysis.

Parameters:
  • init_model – The initial model containing the necessary data for plotting.

  • port (int) – The port on which the Dash application will run. Default is 5000.

Returns:

A list containing the nodes and compartments for the Cytoscape graph.

Return type:

list

plot_internalflux(time_col, meas_cols=None, ncols=5, figsize=None, res_ids=None, **plot_kwargs)

Plot internal flux values over time.

Parameters:
  • time_col – The name of the time column.

  • meas_cols – The columns to measure, defaults to None (uses all columns).

  • ncols – Number of columns in the subplot layout, defaults to 5.

  • figsize – The size of the figure, defaults to None (auto-calculated).

  • res_ids – List of result IDs, defaults to the current index if None.

  • plot_kwargs – Additional keyword arguments for the plot.

Return type:

plt.Figure

Returns:

The figure containing the plots.

plot_vref(x_col='Reactions', meas_cols=None, ncols=5, figsize=None, res_ids=None, **plot_kwargs)

Plot reference flux values.

Parameters:
  • x_col – The name of the x-axis column, defaults to ‘Reactions’.

  • meas_cols – The columns to measure, defaults to None (uses all columns).

  • ncols – Number of columns in the subplot layout, defaults to 5.

  • figsize – The size of the figure, defaults to None (auto-calculated).

  • res_ids – List of result IDs, defaults to the current index if None.

  • plot_kwargs – Additional keyword arguments for the plot.

Return type:

plt.Figure

Returns:

The figure containing the plots.

sel_compartment(comp=1)

Selects the results for a specific compartment.

Parameters:

comp (int) – The compartment index to select.

Return type:

None

sel_result(res_index='Main')

Selects a specific result by index.

Parameters:

res_index (str) – The index of the result to select.

Raises:

ValueError – If the index is not found in the results.

Return type:

None

summarize_runs(n_show=100)

Summarizes the simulation runs and generates a styled report.

Parameters:

n_show (int) – Number of top results to show statistics for.

Returns:

A styled DataFrame summary of the runs.

Return type:

pd.io.formats.style.Styler

class gfapy.igfa_fitflux.parse_model(stoich, feat_meta, met_meta, n_comp, lin_rxns, name=None, author=None)

Bases: parse_model_generic

A class to model and analyze the glycoform production based on the provided stoichiometric and feature metadata.

Attributes:

  • time_col_name (str) – The name of the time column in the data.

  • spec_prod (DataFrame) – Specific production data.

  • glyco_flux (DataFrame) – Glycoform flux data.

__init__(stoich, feat_meta, met_meta, n_comp, lin_rxns, name=None, author=None)

Initialize the parse_model instance.

Parameters:
  • stoich (DataFrame) – Stoichiometric coefficients.

  • feat_meta (DataFrame) – Feature metadata.

  • met_meta (DataFrame) – Metabolite metadata.

  • n_comp (int) – Number of components in the model.

  • lin_rxns (DataFrame) – Linear reactions.

  • name (str, optional) – Name of the model.

  • author (str, optional) – Author of the model.

read_measurements(data, time_col_name='Time (WD)')

Read measurements from the given data.

met_meta and frac should have the same glycoform IDs. All keys in data should have timepoints (float/int) as their index.

Parameters:
  • data (DataFrame) – Data containing measurements.

  • time_col_name (str) – Column name for time data, default is ‘Time (WD)’.

Raises:

ValueError – If ‘met_meta’ and ‘frac’ do not have matching glycoform IDs.
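
The ID check described above can be pictured as a set comparison that raises ValueError on any mismatch. This is a minimal sketch, not the library's code: check_matching_ids is a hypothetical helper and the glycoform IDs are example values.

```python
def check_matching_ids(met_meta_ids, frac_ids):
    """Raise ValueError when the two ID collections differ (hypothetical helper)."""
    mismatched = set(met_meta_ids) ^ set(frac_ids)  # IDs present in only one of the two
    if mismatched:
        raise ValueError(f"Glycoform IDs do not match: {sorted(mismatched)}")

check_matching_ids(["G0", "G0F", "G1F"], ["G1F", "G0", "G0F"])  # same IDs, passes

try:
    check_matching_ids(["G0", "G0F"], ["G0", "G2F"])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```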

fetch_data(spec_prod_col='Spec Prod (pg/cells/day)')

Fetch and organize relevant data.

Parameters:

spec_prod_col (str) – Column name for Specific Production, default is ‘Spec Prod (pg/cells/day)’.

Returns:

A dictionary containing organized data.

Return type:

dict

create_pyomomodel(fit_beta='fitted', regularize_params=True)

Create and configure the Pyomo model for optimization.

Parameters:
  • fit_beta (str) – Flag indicating whether to fit beta parameters (default is ‘fitted’). Can be ‘fitted’, ‘measured’ or None. If None, betas are not fitted to the secreted flux ratio.

  • regularize_params (bool) – Flag indicating whether to apply regularization (default is True).

Returns:

Configured Pyomo model instance.

Return type:

pyo.AbstractModel

get_results(instance)

Retrieve the results from the optimization model instance.

Parameters:

instance (pyo.AbstractModel) – The Pyomo model instance containing optimization results.

Returns:

A dictionary containing the results of the model.

Return type:

dict

create_perturb_model()

Create a perturbation model for analyzing the effects of parameter changes.

Returns:

A Pyomo model for perturbation analysis.

Return type:

pyo.AbstractModel

run_multistart(instance, strategy='rand', iterations=100, suppress_warning=True, solver_options=None)

Runs a multi-start optimization for the given instance.

Parameters:
  • instance (pyo.ConcreteModel) – The Pyomo model instance to be solved.

  • strategy (str, optional) – The strategy for multi-start optimization (default is ‘rand’).

  • iterations (int, optional) – The number of iterations to run (default is 100).

  • suppress_warning (bool, optional) – Flag to suppress warnings (default is True).

  • solver_options (dict[str, any], optional) – Additional solver options to configure.

Returns:

A tuple containing the results dictionary and the status run dictionary.

Return type:

tuple[dict[int, any], dict[str, list[int]]]
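
The idea behind a random multi-start strategy is to run a local optimizer from many random starting points and keep every run's outcome. The sketch below does this for a one-dimensional test function with plain gradient descent instead of Pyomo and a solver. The function, the status keys "success" and "failed", and the convergence test are all assumptions chosen to mirror the documented return shape.

```python
import random

def objective(x):
    # Non-convex test function with two local minima.
    return x**4 - 3 * x**2 + x

def gradient(x):
    return 4 * x**3 - 6 * x + 1

def local_descent(x0, step=0.01, iters=2000):
    """Plain gradient descent from a single starting point."""
    x = x0
    for _ in range(iters):
        x -= step * gradient(x)
    return x

def multistart(iterations=20, seed=0):
    rng = random.Random(seed)
    results, status = {}, {"success": [], "failed": []}
    for i in range(iterations):
        x0 = rng.uniform(-2.0, 2.0)        # random starting point
        x = local_descent(x0)
        ok = abs(gradient(x)) < 1e-6       # treat a near-zero gradient as converged
        status["success" if ok else "failed"].append(i)
        if ok:
            results[i] = {"x": x, "objective": objective(x)}
    return results, status

results, status = multistart()
best_run = min(results, key=lambda i: results[i]["objective"])
```

Keeping all runs, rather than only the best one, is what makes a later ranking step such as choose_top_n possible.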

run_singlestart(instance, solver_options=None)

Runs a single-start optimization for the given instance.

Parameters:
  • instance (pyo.ConcreteModel) – The Pyomo model instance to be solved.

  • solver_options (dict[str, any], optional) – Additional solver options to configure.

Returns:

A dictionary containing the results of the optimization.

Return type:

dict[str, any]

gfapy.plotter.plot_fig(time_col, data, meas_col, smooth_col=None, ax=None)

Plots a single measurement over time, optionally including a fitted curve.

Parameters:
  • time_col (str) – The column name representing time in the data.

  • data (pd.DataFrame) – The data containing measurements.

  • meas_col (str) – The column name representing the measurement to plot.

  • smooth_col (str, optional) – The column name representing the fitted curve.

  • ax (plt.Axes, optional) – The Axes object to plot on. If None, a new Axes object will be created.

Returns:

The generated figure containing the plot.

Return type:

plt.Figure

gfapy.plotter.plot_meas(time_col, orig_data, smooth_data=None, meas_cols=None, ncols=3, figsize=(5, 5))

Plots measurements over time, optionally including smoothed data.

Parameters:
  • time_col (str) – The column name representing time in the original data.

  • orig_data (pd.DataFrame) – The original data containing measurements.

  • smooth_data (pd.DataFrame, optional) – The smoothed data to be plotted.

  • meas_cols (list, optional) – List of measurement column names to plot. If None, all columns except the time column will be plotted.

  • ncols (int) – The number of columns to arrange the subplots in.

  • figsize (tuple) – The size of the figure.

Returns:

The generated figure containing the plots.

Return type:

plt.Figure
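
The ncols argument fixes how many panels sit in each row, so the number of rows follows from the number of measurement columns. A minimal sketch of that arithmetic is given below; grid_shape is a hypothetical helper, and capping ncols at the panel count is an assumption about sensible behavior rather than documented gfapy behavior.

```python
import math

def grid_shape(n_plots, ncols=3):
    """Rows and columns needed for n_plots panels (assumes n_plots >= 1)."""
    ncols = min(ncols, n_plots)  # avoid empty columns when there are few panels
    nrows = math.ceil(n_plots / ncols)
    return nrows, ncols

shape = grid_shape(7, ncols=3)
```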

gfapy.plotter.plot_perturb(time_col, x_col, orig_data, meas_cols=None, ncols=3, figsize=(5, 5))

Plots perturbations of measurements over a specified axis.

Parameters:
  • time_col (str) – The column name representing time in the original data.

  • x_col (str) – The column name representing the perturbation variable.

  • orig_data (pd.DataFrame) – The original data containing measurements.

  • meas_cols (list, optional) – List of measurement column names to plot. If None, all columns except the time column will be plotted.

  • ncols (int) – The number of columns to arrange the subplots in.

  • figsize (tuple) – The size of the figure.

Returns:

The generated figure containing the plots.

Return type:

plt.Figure