{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Going beyond builtin\n",
    "\n",
    "The `pyglotaran-extras` package is a utility library that enables users to quickly inspect and visualize \n",
    "results from `pyglotaran` in the most common ways we know of.\n",
    "\n",
    "However, specific needs can vary a lot on a case-by-case basis, and we can't possibly \n",
    "anticipate all of them.\n",
    "\n",
    "Thus, it is important that you as a user are familiar with the underlying libraries \n",
    "that the `pyglotaran-extras` package builds on, so that you are able to help yourself.\n",
    "\n",
    "> Giving a user a plot will fit their needs for this case; teaching a user how to create their own \n",
    "> plots will help them with all their needs.  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Basics of working with `xarray`\n",
    "\n",
    "The `xarray` library is the backbone of how `pyglotaran` stores result data, which is why it is \n",
    "important to know how to work with it.\n",
    "\n",
    "Let's start by creating an example `Result` by utilizing the simulation capabilities of `pyglotaran`\n",
    "and the included example test data.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from glotaran.testing.simulated_data.parallel_spectral_decay import SCHEME\n",
    "from glotaran.optimization.optimize import optimize\n",
    "\n",
    "result = optimize(SCHEME)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Inspecting result data\n",
    "\n",
    "Before we can select data, we first need to know which data we actually have.\n",
    "\n",
    "So let's have a look at the `data` attribute of our example `Result`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "result.data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At first glance we can see that the `Result` contains only a single dataset named `dataset_1`.\n",
    "\n",
    "For ease of use, let's assign it to a variable `ds` and have a closer look."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds = result.data[\"dataset_1\"]\n",
    "ds"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Accessing data inside of a dataset\n",
    "\n",
    "When looking at the `Data Variables` we can see that we have a variable called `fitted_data`, which holds 2D data\n",
    "with the dimensions `time` and `spectral`.\n",
    "\n",
    "We can access this data variable in 3 ways:\n",
    "- Attribute-style access with `ds.fitted_data`\n",
    "- Dict-like access with `ds[\"fitted_data\"]`\n",
    "- Via `data_vars` using `ds.data_vars[\"fitted_data\"]`\n",
    "\n",
    "These three ways to access `fitted_data` are equivalent and give you the same data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(f'{ds.fitted_data.equals(ds[\"fitted_data\"])=}')\n",
    "print(f'{ds.fitted_data.equals(ds.data_vars[\"fitted_data\"])=}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Basic plotting\n",
    "\n",
    "Now that we know how to access the data we are interested in, let's plot it.\n",
    "\n",
    "Luckily for us, `xarray` comes with built-in convenience functionality that lets us quickly have a look at the data.\n",
    "\n",
    "For data with up to two dimensions, `xarray` is pretty good at guessing what we want to plot when we simply \n",
    "call the `plot` attribute on our data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.fitted_data.plot();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```{note}\n",
    "We added the `;` at the end so the underlying structure of the Python object that `.plot()` returns won't distract us.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we would rather have `time` on the x-axis, we can simply tell `xarray` so."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.fitted_data.plot(x=\"time\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "While a 2D plot is pretty to look at, it often doesn't provide us with enough detail and we would \n",
    "rather see multiple lines plotted.\n",
    "\n",
    "This can easily be achieved by telling `xarray` which kind of plot we want rather than letting it \n",
    "guess based on the dimensionality of our data.\n",
    "\n",
    "Since we want to plot lines, we will use the `.line` method on the `plot` attribute.\n",
    "\n",
    "```{note}\n",
    "Since we have 2D data, we now need to tell `xarray` which dimension to use as the x-axis,\n",
    "so it can create a separate line for each data point along the other dimension.\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.fitted_data.plot.line(x=\"time\", add_legend=False);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Even though we now have a line plot, this isn't what we wanted, because each point on the `spectral` \n",
    "dimension resulted in its own line, leaving us with a plot that contains 72 lines."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data selection"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To reduce the number of lines we need to select a subset of our data based on the values of a dimension.\n",
    "\n",
    "The `xarray` library provides two main ways to select data:\n",
    "- **`.sel()`** - Select by dimension label values (e.g., select a specific wavelength)\n",
    "- **`.isel()`** - Select by dimension index (e.g., select the 5th time point)\n",
    "\n",
    "Let's start with selecting a single wavelength from our fitted data using `.sel()`.\n",
    "\n",
    "Since we have discrete wavelength values, we need to use the `method=\"nearest\"` parameter to find \n",
    "the closest match to our desired value."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.fitted_data.sel(spectral=0, method=\"nearest\").plot();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Much better! Now we have a single trace showing how the fitted data changes over time at a specific \n",
    "wavelength.\n",
    "\n",
    "We can also select multiple values at once by passing a list. This is particularly useful when \n",
    "working with categorical dimensions like `species`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.sel(species=[\"species_1\", \"species_2\"]).species_associated_spectra.plot.line(x=\"spectral\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can clearly see the individual species associated spectra for the selected species.\n",
    "\n",
    "When you want to select data by position rather than by label, use `.isel()` (index select).\n",
    "\n",
    "This is especially useful when working with slices to select ranges of data. For example, let's \n",
    "plot a subset of our IRF data from time index 80 to 200."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.isel(time=slice(80, 200)).irf.plot();"
   ]
  },
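  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`.sel()` and `.isel()` also differ in how they treat slices. The following is a minimal sketch on toy data \n",
    "(the array `toy` and its values are made up for illustration and are not part of the example `Result`): \n",
    "a label-based slice includes both end points, while an index-based slice excludes the stop index, \n",
    "just like slicing a Python list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import xarray as xr\n",
    "\n",
    "# Toy data (hypothetical values): 5 time points x 3 wavelengths.\n",
    "toy = xr.DataArray(\n",
    "    np.arange(15).reshape(5, 3),\n",
    "    coords={\"time\": [0.0, 0.5, 1.0, 1.5, 2.0], \"spectral\": [600, 650, 700]},\n",
    "    dims=(\"time\", \"spectral\"),\n",
    ")\n",
    "\n",
    "# Label based: 640 is not a coordinate value, so \"nearest\" falls back to 650.\n",
    "nearest_trace = toy.sel(spectral=640, method=\"nearest\")\n",
    "print(float(nearest_trace.spectral))\n",
    "\n",
    "# Label-based slices include both end points ...\n",
    "by_label = toy.sel(time=slice(0.5, 1.5))\n",
    "print(by_label.time.values)\n",
    "\n",
    "# ... while index-based slices exclude the stop index.\n",
    "by_index = toy.isel(time=slice(1, 3))\n",
    "print(by_index.time.values)"
   ]
  },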
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Modifying coordinates\n",
    "\n",
    "Sometimes you need to transform the coordinate values themselves rather than just selecting subsets \n",
    "of data. \n",
    "\n",
    "A common use case is shifting the time axis so that time zero corresponds to the IRF location.\n",
    "\n",
    "The `pyglotaran-extras` package provides helper functions like `extract_irf_location()` to make \n",
    "this easier. Let's extract the IRF location and shift the time coordinates accordingly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyglotaran_extras.plotting.utils import extract_irf_location\n",
    "\n",
    "irf_location = extract_irf_location(ds)\n",
    "ds_shifted = ds.copy()\n",
    "ds_shifted[\"time\"] = ds.time - irf_location\n",
    "ds_shifted.isel(time=slice(80, 200)).irf.plot();"
   ]
  },
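  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As an alternative to copying the dataset and overwriting the coordinate, `xarray` also offers \n",
    "`assign_coords()`, which returns a new object and leaves the original untouched. \n",
    "Below is a minimal sketch on toy data, where `toy_ds` and `assumed_irf_location` are hypothetical \n",
    "stand-ins for `ds` and the extracted IRF location."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import xarray as xr\n",
    "\n",
    "# Toy dataset (hypothetical values) with a single variable along `time`.\n",
    "toy_ds = xr.Dataset(\n",
    "    {\"signal\": (\"time\", np.array([0.0, 1.0, 0.5, 0.2]))},\n",
    "    coords={\"time\": [-1.0, 0.0, 1.0, 2.0]},\n",
    ")\n",
    "assumed_irf_location = 0.5  # made-up value, only for illustration\n",
    "\n",
    "# `assign_coords` returns a new dataset with the shifted time axis.\n",
    "toy_shifted = toy_ds.assign_coords(time=toy_ds.time - assumed_irf_location)\n",
    "\n",
    "print(toy_ds.time.values)  # original coordinates are unchanged\n",
    "print(toy_shifted.time.values)  # shifted coordinates"
   ]
  },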
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By combining `.sel()`, `.isel()` and the various plotting methods, you can create customized \n",
    "visualizations that exactly fit your analysis needs.\n",
    "\n",
    "```{tip}\n",
    "You can chain selections together: `ds.sel(species=\"species_1\").isel(time=slice(0, 100))` \n",
    "to first select by label and then by index.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Working with cyclers\n",
    "\n",
    "One of the most common plot customizations besides data selection is changing the plot style.\n",
    "\n",
    "```{note}\n",
    "For more information on how to use `cycler` have a look at the [`matplotlib` documentation](https://matplotlib.org/cycler/).\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from cycler import cycler\n",
    "from pyglotaran_extras.plotting.style import PlotStyle\n",
    "from pyglotaran_extras.inspect import inspect_cycler"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Inspecting the default cycler\n",
    "\n",
    "The `pyglotaran-extras` package comes with a built-in `PlotStyle` that defines a default `cycler` used by all plotting functions.\n",
    "\n",
    "The `inspect_cycler` function lets us visualize the properties of a cycler as a table, including a small preview of each line style. Let's have a look at the default one."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "inspect_cycler(PlotStyle().cycler)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "That's quite a lot of entries! Let's check exactly how many styles are defined in the default cycler."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "len(PlotStyle().cycler)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Slicing a cycler\n",
    "\n",
    "For many plots we only need a handful of styles. Just like a Python list, a `Cycler` can be sliced\n",
    "to create a smaller subset. Let's create a `small_cycler` containing only the first 3 entries."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "small_cycler = PlotStyle().cycler[:3]\n",
    "inspect_cycler(small_cycler)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Repeating a cycler\n",
    "\n",
    "If you need the same set of styles to repeat, you can multiply a cycler by an integer.\n",
    "This simply concatenates the cycler with itself the given number of times."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "inspect_cycler(small_cycler * 2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Adding properties with `+`\n",
    "\n",
    "The `+` operator performs an **element-wise** combination of two cyclers of equal length.\n",
    "This is useful when you want to add a new property (e.g. `linestyle`) to an existing cycler.\n",
    "\n",
    "Since element-wise addition requires both cyclers to have the same length, we multiply the\n",
    "single-entry linestyle cycler by `len(small_cycler)` to match sizes first."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "inspect_cycler(small_cycler + cycler(linestyle=[\":\"]) * len(small_cycler))"
   ]
  },
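  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see why the lengths need to match, here is a small sketch using plain `cycler` objects \n",
    "(the colors are arbitrary example values): adding cyclers of different lengths raises a `ValueError`, \n",
    "while repeating the shorter one first works."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from cycler import cycler\n",
    "\n",
    "# Arbitrary example values, not the default `PlotStyle` colors.\n",
    "toy_color_cycler = cycler(color=[\"tab:blue\", \"tab:orange\", \"tab:green\"])\n",
    "single_style = cycler(linestyle=[\":\"])\n",
    "\n",
    "try:\n",
    "    toy_color_cycler + single_style  # lengths 3 and 1 do not match\n",
    "except ValueError as error:\n",
    "    print(f\"Length mismatch: {error}\")\n",
    "\n",
    "# Repeat the single entry so both cyclers have the same length.\n",
    "matched = toy_color_cycler + single_style * len(toy_color_cycler)\n",
    "for entry in matched:\n",
    "    print(entry)"
   ]
  },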
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Combining cyclers with `*` (outer product)\n",
    "\n",
    "The `*` operator between two cyclers creates an **outer product**, generating all possible\n",
    "combinations of both cyclers' entries.\n",
    "\n",
    "When one of the cyclers has a single entry, the result simply applies that property to every entry\n",
    "of the other cycler — similar to using `+`, but without needing to match lengths manually."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "inspect_cycler(small_cycler * cycler(linestyle=[\":\"]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When the second cycler has **multiple entries**, the outer product creates all combinations — \n",
    "resulting in `len(a) × len(b)` total entries. Here our 3-entry color cycler combined with a\n",
    "2-entry linestyle cycler gives us 6 distinct styles."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "inspect_cycler(small_cycler * cycler(linestyle=[\"-\", \":\"]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```{important}\n",
    "Just as in math, the **outer product** isn't commutative, so the order matters!\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "inspect_cycler(cycler(linestyle=[\"-\", \":\"]) * small_cycler)"
   ]
  },
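  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The effect of the order is easiest to see on a tiny example. The sketch below uses two made-up \n",
    "two-entry cyclers and prints the entries of both products: the left cycler varies slowest and \n",
    "the right one varies fastest."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from cycler import cycler\n",
    "\n",
    "# Made-up minimal cyclers, only for illustration.\n",
    "toy_colors = cycler(color=[\"r\", \"g\"])\n",
    "toy_styles = cycler(linestyle=[\"-\", \":\"])\n",
    "\n",
    "# Colors on the left: each color is combined with every linestyle before moving on.\n",
    "colors_first = list(toy_colors * toy_styles)\n",
    "# Linestyles on the left: each linestyle is combined with every color before moving on.\n",
    "styles_first = list(toy_styles * toy_colors)\n",
    "\n",
    "print([(entry[\"color\"], entry[\"linestyle\"]) for entry in colors_first])\n",
    "print([(entry[\"color\"], entry[\"linestyle\"]) for entry in styles_first])"
   ]
  },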
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Compose your own plotting function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we know how to work with `xarray` data and customize plot styles with cyclers, let's put it all together.\n",
    "\n",
    "The `pyglotaran-extras` package is designed to be **composable**: \n",
    "its individual plot functions like `plot_concentrations`, `plot_sas`, `plot_svd`, `plot_residual`, etc. \n",
    "are all building blocks that can be freely reused and rearranged to create exactly the visualization you need.\n",
    "\n",
    "Instead of being limited to the built-in overview plots, you can:\n",
    "- Pick only the specific plot functions relevant to your analysis\n",
    "- Arrange them in any layout using `matplotlib`'s subplot system\n",
    "- Pass in a custom `cycler` to keep a consistent style across all panels\n",
    "- Apply additional `matplotlib` customizations on top\n",
    "\n",
    "Let's create a custom plotting function that combines concentration and spectra plots side by side, \n",
    "using a custom cycler we build from what we learned above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "from cycler import cycler\n",
    "from pyglotaran_extras import add_subplot_labels, plot_concentrations, plot_sas\n",
    "from pyglotaran_extras.plotting.style import PlotStyle\n",
    "\n",
    "# Pair the first three default styles element-wise with three linestyles.\n",
    "custom_cycler = PlotStyle().cycler[:3] + cycler(linestyle=[\"-\", \":\", \"--\"])\n",
    "\n",
    "\n",
    "def plot_concentration_and_spectra(result_dataset):\n",
    "    \"\"\"Plot concentrations (left) and spectra (right) side by side with a shared cycler.\"\"\"\n",
    "    fig, axes = plt.subplots(1, 2, figsize=(15, 4))\n",
    "    plot_concentrations(result_dataset, axes[0], center_λ=0, linlog=True, cycler=custom_cycler)\n",
    "    plot_sas(result_dataset, axes[1], cycler=custom_cycler)\n",
    "    return fig, axes\n",
    "\n",
    "\n",
    "# Skip the first 100 time points, then customize the returned axes with plain matplotlib.\n",
    "fig, axes = plot_concentration_and_spectra(ds.isel(time=slice(100, None)))\n",
    "axes[0].set_xlabel(\"Time (ps)\")\n",
    "axes[0].set_ylabel(\"\")\n",
    "axes[0].axhline(0, color=\"k\", linewidth=0.5)\n",
    "axes[1].set_xlabel(\"Wavelength (nm)\")\n",
    "axes[1].set_ylabel(\"SADS (OD)\")\n",
    "axes[1].set_title(\"SADS\")\n",
    "axes[1].axhline(0, color=\"k\", linewidth=0.5)\n",
    "# Label the panels as a), b), ...\n",
    "add_subplot_labels(axes, label_format_function=\"lower_case_letter\", label_format_template=\"{})\");"
   ]
  },
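  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you want to style your own `matplotlib` plots with a cycler, without going through a \n",
    "`pyglotaran-extras` plot function, you can set it on the axes directly with `set_prop_cycle`. \n",
    "This is only a sketch with made-up decay curves; `demo_cycler` and the rates are illustrative assumptions. \n",
    "Note how the third line wraps around to the first style again."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "from cycler import cycler\n",
    "\n",
    "# Two styles only (made-up values), so a third line reuses the first style.\n",
    "demo_cycler = cycler(color=[\"tab:blue\", \"tab:red\"]) + cycler(linestyle=[\"-\", \"--\"])\n",
    "\n",
    "demo_fig, demo_ax = plt.subplots()\n",
    "# Every following `plot` call on this axes takes the next style from the cycler.\n",
    "demo_ax.set_prop_cycle(demo_cycler)\n",
    "\n",
    "demo_x = np.linspace(0, 1, 50)\n",
    "demo_lines = [demo_ax.plot(demo_x, np.exp(-rate * demo_x))[0] for rate in (1, 5, 10)]\n",
    "\n",
    "print([(line.get_color(), line.get_linestyle()) for line in demo_lines])"
   ]
  },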
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```{tip}\n",
    "To reduce repetition, check out the [documentation on using plot config](../config/project/subproject/config_docs.ipynb) and \n",
    "how to use it for your own plot functions ([`use_plot_config`](../../api/pyglotaran_extras/pyglotaran_extras.config.plot_config.html#pyglotaran_extras.config.plot_config.use_plot_config)).\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.19"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}