Track notebooks, scripts & functions¶

For tracking pipelines, see: Pipelines – workflow managers.

# pip install 'lamindb[jupyter]'
!lamin init --storage ./test-track

Track a notebook or script¶

Call track() to register your notebook or script as a transform and start capturing inputs & outputs of a run.

import lamindb as ln

ln.track()  # initiate a tracked notebook/script run

# your code automatically tracks inputs & outputs

ln.finish()  # mark run as finished, save execution report, source code & environment

You find your notebooks and scripts in the Transform registry (along with pipelines & functions). Run stores executions. You can use all usual ways of querying to obtain one or multiple transform records, e.g.:

transform = ln.Transform.get(key="my_analyses/my_notebook.ipynb")
transform.source_code  # source code
transform.runs  # all runs
transform.latest_run.report  # report of latest run
transform.latest_run.environment  # environment of latest run

To load a notebook or script from the hub, search or filter the transform page and use the CLI.

lamin load https://lamin.ai/laminlabs/lamindata/transform/13VINnFk89PE

Use projects¶

You can link the entities created during a run to a project.

import lamindb as ln

my_project = ln.Project(name="My project").save()  # create a project

ln.track(project="My project")  # auto-link entities to "My project"

ln.Artifact(ln.core.datasets.file_fcs(), key="my_file.fcs").save()  # save an artifact

Filter entities by project, e.g., artifacts:

ln.Artifact.filter(projects=my_project).df()

Show code cell output Hide code cell output

	uid	key	description	suffix	kind	otype	size	hash	n_files	n_observations	_hash_type	_key_is_virtual	_overwrite_versions	space_id	storage_id	schema_id	version	is_latest	run_id	created_at	created_by_id	_aux	branch_id
id
1	ma8gmwBVjzvO6xmo0000	my_file.fcs	None	.fcs	None	None	19330507	rCPvmZB19xs4zHZ7p_-Wrg	None	None	md5	True	False	1	1	None	None	True	1	2025-06-03 11:32:59.134000+00:00	1	None	1

Access entities linked to a project.

display(my_project.artifacts.df())
display(my_project.transforms.df())
display(my_project.runs.df())

Show code cell output Hide code cell output

	uid	key	description	suffix	kind	otype	size	hash	n_files	n_observations	_hash_type	_key_is_virtual	_overwrite_versions	space_id	storage_id	schema_id	version	is_latest	run_id	created_at	created_by_id	_aux	branch_id
id
1	ma8gmwBVjzvO6xmo0000	my_file.fcs	None	.fcs	None	None	19330507	rCPvmZB19xs4zHZ7p_-Wrg	None	None	md5	True	False	1	1	None	None	True	1	2025-06-03 11:32:59.134000+00:00	1	None	1

	uid	key	description	type	source_code	hash	reference	reference_type	space_id	_template_id	version	is_latest	created_at	created_by_id	_aux	branch_id
id
1	lPGHfnTi6kow0000	track.ipynb	Track notebooks, scripts & functions	notebook	None	None	None	None	1	None	None	True	2025-06-03 11:32:56.695000+00:00	1	None	1

	uid	name	started_at	finished_at	reference	reference_type	_is_consecutive	_status_code	space_id	transform_id	report_id	_logfile_id	environment_id	initiated_by_run_id	created_at	created_by_id	_aux	branch_id
id
1	3uaCCcz7qchuDKC6	None	2025-06-03 11:32:56.706805+00:00	None	None	None	None	0	1	1	None	None	None	None	2025-06-03 11:32:56.707000+00:00	1	None	1

Use spaces¶

You can write the entities created during a run into a space that you configure on LaminHub. This is particularly useful if you want to restrict access to a space. Note that this doesn’t affect bionty entities who should typically be commonly accessible.

ln.track(space="Our team space")

Track parameters¶

In addition to tracking source code, run reports & environments, you can track run parameters.

Track run parameters¶

First, define valid parameters, e.g.:

ln.Feature(name="input_dir", dtype=str).save()
ln.Feature(name="learning_rate", dtype=float).save()
ln.Feature(name="preprocess_params", dtype="dict").save()

If you hadn’t defined these parameters, you’d get a ValidationError in the following script.

run_track_with_params.py¶

import argparse
import lamindb as ln

if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--input-dir", type=str)
    p.add_argument("--downsample", action="store_true")
    p.add_argument("--learning-rate", type=float)
    args = p.parse_args()
    params = {
        "input_dir": args.input_dir,
        "learning_rate": args.learning_rate,
        "preprocess_params": {
            "downsample": args.downsample,  # nested parameter names & values in dictionaries are not validated
            "normalization": "the_good_one",
        },
    }
    ln.track(params=params)

    # your code

    ln.finish()

Run the script.

!python scripts/run_track_with_params.py  --input-dir ./mydataset --learning-rate 0.01 --downsample

Query by run parameters¶

Query for all runs that match a certain parameters:

ln.Run.filter(
    learning_rate=0.01, input_dir="./mydataset", preprocess_params__downsample=True
).df()

Show code cell output Hide code cell output

	uid	name	started_at	finished_at	reference	reference_type	_is_consecutive	_status_code	space_id	transform_id	report_id	_logfile_id	environment_id	initiated_by_run_id	created_at	created_by_id	_aux	branch_id
id
2	2feNkI5AUb7w3Be4	None	2025-06-03 11:33:02.023664+00:00	2025-06-03 11:33:03.211575+00:00	None	None	True	0	1	2	3	None	2	None	2025-06-03 11:33:02.024000+00:00	1	None	1

Note that:

preprocess_params__downsample=True traverses the dictionary preprocess_params to find the key "downsample" and match it to True
nested keys like "downsample" in a dictionary do not appear in Feature and hence, do not get validated

Access parameters of a run¶

Below is how you get the parameter values that were used for a given run.

run = ln.Run.filter(learning_rate=0.01).order_by("-started_at").first()
run.features.get_values()

Explore parameter values¶

If you want to query all parameter values together with other feature values, use FeatureValue.

ln.models.FeatureValue.df(include=["feature__name", "created_by__handle"])

Show code cell output Hide code cell output

	value	hash	feature__name	created_by__handle
id
1	./mydataset	71I4KdtOlqWZYoR9KaVTvw	input_dir	testuser1
2	0.01	BIF-_RHBU2Sm7COXgAOIYg	learning_rate	testuser1
3	{'downsample': True, 'normalization': 'the_goo...	4ehQH8UO25aNM181K_gloQ	preprocess_params	testuser1

Track functions¶

If you want more-fined-grained data lineage tracking, use the tracked() decorator.

In a notebook¶

ln.Feature(name="subset_rows", dtype="int").save()  # define parameters
ln.Feature(name="subset_cols", dtype="int").save()
ln.Feature(name="input_artifact_key", dtype="str").save()
ln.Feature(name="output_artifact_key", dtype="str").save()

Feature(uid='fs0IM3re4l8n', name='output_artifact_key', dtype='str', array_rank=0, array_size=0, branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-06-03 11:33:03 UTC)

Define a function and decorate it with tracked():

@ln.tracked()
def subset_dataframe(
    input_artifact_key: str,
    output_artifact_key: str,
    subset_rows: int = 2,
    subset_cols: int = 2,
) -> None:
    artifact = ln.Artifact.get(key=input_artifact_key)
    dataset = artifact.load()
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    ln.Artifact.from_df(new_data, key=output_artifact_key).save()

Prepare a test dataset:

df = ln.core.datasets.small_dataset1(otype="DataFrame")
input_artifact_key = "my_analysis/dataset.parquet"
artifact = ln.Artifact.from_df(df, key=input_artifact_key).save()

Run the function with default params:

ouput_artifact_key = input_artifact_key.replace(".parquet", "_subsetted.parquet")
subset_dataframe(input_artifact_key, ouput_artifact_key)

Query for the output:

subsetted_artifact = ln.Artifact.get(key=ouput_artifact_key)
subsetted_artifact.view_lineage()

_images/9aac521696030180750122eee118aa499aa823ad48a036addcf71d9e28bb153c.svg

This is the run that created the subsetted_artifact:

subsetted_artifact.run

Run(uid='yJl4FOYLQnAW6Oub', started_at=2025-06-03 11:33:03 UTC, finished_at=2025-06-03 11:33:03 UTC, branch_id=1, space_id=1, transform_id=3, created_by_id=1, initiated_by_run_id=1, created_at=2025-06-03 11:33:03 UTC)

This is the function that created it:

subsetted_artifact.run.transform

Transform(uid='DFbFVl7tlmAz0000', is_latest=True, key='track.ipynb/subset_dataframe.py', type='function', hash='F_wwrfFs6zmzMGVilG2Prg', branch_id=1, space_id=1, created_by_id=1, created_at=2025-06-03 11:33:03 UTC)

This is the source code of this function:

subsetted_artifact.run.transform.source_code

'@ln.tracked()\ndef subset_dataframe(\n    input_artifact_key: str,\n    output_artifact_key: str,\n    subset_rows: int = 2,\n    subset_cols: int = 2,\n) -> None:\n    artifact = ln.Artifact.get(key=input_artifact_key)\n    dataset = artifact.load()\n    new_data = dataset.iloc[:subset_rows, :subset_cols]\n    ln.Artifact.from_df(new_data, key=output_artifact_key).save()\n'

These are all versions of this function:

subsetted_artifact.run.transform.versions.df()

	uid	key	description	type	source_code	hash	reference	reference_type	space_id	_template_id	version	is_latest	created_at	created_by_id	_aux	branch_id
id
3	DFbFVl7tlmAz0000	track.ipynb/subset_dataframe.py	None	function	@ln.tracked()\ndef subset_dataframe(\n inpu...	F_wwrfFs6zmzMGVilG2Prg	None	None	1	None	None	True	2025-06-03 11:33:03.838000+00:00	1	None	1

This is the initating run that triggered the function call:

subsetted_artifact.run.initiated_by_run

Run(uid='3uaCCcz7qchuDKC6', started_at=2025-06-03 11:32:56 UTC, branch_id=1, space_id=1, transform_id=1, created_by_id=1, created_at=2025-06-03 11:32:56 UTC)

This is the transform of the initiating run:

subsetted_artifact.run.initiated_by_run.transform

Transform(uid='lPGHfnTi6kow0000', is_latest=True, key='track.ipynb', description='Track notebooks, scripts & functions', type='notebook', branch_id=1, space_id=1, created_by_id=1, created_at=2025-06-03 11:32:56 UTC)

These are the parameters of the run:

subsetted_artifact.run.features.get_values()

{'input_artifact_key': 'my_analysis/dataset.parquet',
 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet',
 'subset_cols': 2,
 'subset_rows': 2}

These input artifacts:

subsetted_artifact.run.input_artifacts.df()

	uid	key	description	suffix	kind	otype	size	hash	n_files	n_observations	_hash_type	_key_is_virtual	_overwrite_versions	space_id	storage_id	schema_id	version	is_latest	run_id	created_at	created_by_id	_aux	branch_id
id
4	a25o6DzknEZNYGCB0000	my_analysis/dataset.parquet	None	.parquet	dataset	DataFrame	9868	Rr1T1Z_dH4OFekK6Y6AZzQ	None	3	md5	True	False	1	1	None	None	True	1	2025-06-03 11:33:03.815000+00:00	1	None	1

These are output artifacts:

subsetted_artifact.run.output_artifacts.df()

	uid	key	description	suffix	kind	otype	size	hash	n_files	n_observations	_hash_type	_key_is_virtual	_overwrite_versions	space_id	storage_id	schema_id	version	is_latest	run_id	created_at	created_by_id	_aux	branch_id
id
5	6H9S7BRvp7aFw7Cy0000	my_analysis/dataset_subsetted.parquet	None	.parquet	dataset	DataFrame	3238	dNHL-WWN3PCVS9pyW8pKHA	None	2	md5	True	False	1	1	None	None	True	3	2025-06-03 11:33:03.890000+00:00	1	None	1

Re-run the function with a different parameter:

subsetted_artifact = subset_dataframe(
    input_artifact_key, ouput_artifact_key, subset_cols=3
)
subsetted_artifact = ln.Artifact.get(key=ouput_artifact_key)
subsetted_artifact.view_lineage()

We created a new run:

subsetted_artifact.run

Run(uid='c8eaSJXhL1gJocvp', started_at=2025-06-03 11:33:04 UTC, finished_at=2025-06-03 11:33:04 UTC, branch_id=1, space_id=1, transform_id=3, created_by_id=1, initiated_by_run_id=1, created_at=2025-06-03 11:33:04 UTC)

With new parameters:

subsetted_artifact.run.features.get_values()

{'input_artifact_key': 'my_analysis/dataset.parquet',
 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet',
 'subset_cols': 3,
 'subset_rows': 2}

And a new version of the output artifact:

subsetted_artifact.run.output_artifacts.df()

	uid	key	description	suffix	kind	otype	size	hash	n_files	n_observations	_hash_type	_key_is_virtual	_overwrite_versions	space_id	storage_id	schema_id	version	is_latest	run_id	created_at	created_by_id	_aux	branch_id
id
6	6H9S7BRvp7aFw7Cy0001	my_analysis/dataset_subsetted.parquet	None	.parquet	dataset	DataFrame	3852	gBAN-0lok9-D61VnFuHrAA	None	2	md5	True	False	1	1	None	None	True	4	2025-06-03 11:33:04.381000+00:00	1	None	1

See the state of the database:

ln.view()

Show code cell output Hide code cell output

Artifact

	uid	key	description	suffix	kind	otype	size	hash	n_files	n_observations	_hash_type	_key_is_virtual	_overwrite_versions	space_id	storage_id	schema_id	version	is_latest	run_id	created_at	created_by_id	_aux	branch_id
id
6	6H9S7BRvp7aFw7Cy0001	my_analysis/dataset_subsetted.parquet	None	.parquet	dataset	DataFrame	3852	gBAN-0lok9-D61VnFuHrAA	None	2.0	md5	True	False	1	1	None	None	True	4.0	2025-06-03 11:33:04.381000+00:00	1	None	1
5	6H9S7BRvp7aFw7Cy0000	my_analysis/dataset_subsetted.parquet	None	.parquet	dataset	DataFrame	3238	dNHL-WWN3PCVS9pyW8pKHA	None	2.0	md5	True	False	1	1	None	None	False	3.0	2025-06-03 11:33:03.890000+00:00	1	None	1
4	a25o6DzknEZNYGCB0000	my_analysis/dataset.parquet	None	.parquet	dataset	DataFrame	9868	Rr1T1Z_dH4OFekK6Y6AZzQ	None	3.0	md5	True	False	1	1	None	None	True	1.0	2025-06-03 11:33:03.815000+00:00	1	None	1
3	u07XW2oIe1zuVOpt0000	None	log streams of run 2feNkI5AUb7w3Be4	.txt	__lamindb_run__	None	0	1B2M2Y8AsgTpgAmY7PhCfg	None	NaN	md5	True	False	1	1	None	None	True	NaN	2025-06-03 11:33:03.215000+00:00	1	None	1
2	0ijq7lqqFomKAnq20000	None	requirements.txt	.txt	__lamindb_run__	None	4116	aBb6D9c7xSKLS29QkFZ55g	None	NaN	md5	True	False	1	1	None	None	True	NaN	2025-06-03 11:33:03.208000+00:00	1	None	1
1	ma8gmwBVjzvO6xmo0000	my_file.fcs	None	.fcs	None	None	19330507	rCPvmZB19xs4zHZ7p_-Wrg	None	NaN	md5	True	False	1	1	None	None	True	1.0	2025-06-03 11:32:59.134000+00:00	1	None	1

Feature

	uid	name	dtype	is_type	unit	description	array_rank	array_size	array_shape	proxy_dtype	synonyms	_expect_many	_curation	space_id	type_id	run_id	created_at	created_by_id	_aux	branch_id
id
7	fs0IM3re4l8n	output_artifact_key	str	None	None	None	0	0	None	None	None	None	None	1	None	1	2025-06-03 11:33:03.761000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1
6	ytIshNldZwVI	input_artifact_key	str	None	None	None	0	0	None	None	None	None	None	1	None	1	2025-06-03 11:33:03.752000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1
5	tu5WBU1e9gT4	subset_cols	int	None	None	None	0	0	None	None	None	None	None	1	None	1	2025-06-03 11:33:03.743000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1
4	WxELAsAIzY8w	subset_rows	int	None	None	None	0	0	None	None	None	None	None	1	None	1	2025-06-03 11:33:03.732000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1
3	ffzfKiXm35Pk	preprocess_params	dict	None	None	None	0	0	None	None	None	None	None	1	None	1	2025-06-03 11:32:59.304000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1
2	Akxki4JpcsEz	learning_rate	float	None	None	None	0	0	None	None	None	None	None	1	None	1	2025-06-03 11:32:59.294000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1
1	jGnrSFjEVZIv	input_dir	str	None	None	None	0	0	None	None	None	None	None	1	None	1	2025-06-03 11:32:59.282000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1

FeatureValue

	value	hash	space_id	feature_id	run_id	created_at	created_by_id	_aux	branch_id
id
1	./mydataset	71I4KdtOlqWZYoR9KaVTvw	1	1	NaN	2025-06-03 11:33:02.042000+00:00	1	None	1
2	0.01	BIF-_RHBU2Sm7COXgAOIYg	1	2	NaN	2025-06-03 11:33:02.044000+00:00	1	None	1
3	{'downsample': True, 'normalization': 'the_goo...	4ehQH8UO25aNM181K_gloQ	1	3	NaN	2025-06-03 11:33:02.047000+00:00	1	None	1
4	2	yB5yjZ1ML2NvBn-JzBSGLA	1	4	1.0	2025-06-03 11:33:03.858000+00:00	1	None	1
5	2	yB5yjZ1ML2NvBn-JzBSGLA	1	5	1.0	2025-06-03 11:33:03.861000+00:00	1	None	1
6	my_analysis/dataset.parquet	1ImgyYl4KlCl3XCd-aQE9Q	1	6	1.0	2025-06-03 11:33:03.863000+00:00	1	None	1
7	my_analysis/dataset_subsetted.parquet	G9luXJ51Hi4-Csrifos0Lw	1	7	1.0	2025-06-03 11:33:03.865000+00:00	1	None	1

Project

	uid	name	is_type	abbr	url	start_date	end_date	_status_code	space_id	type_id	run_id	created_at	created_by_id	_aux	branch_id
id
1	oF7eMq3zDyLe	My project	False	None	None	None	None	0	1	None	None	2025-06-03 11:32:55.765000+00:00	1	None	1

Run

	uid	name	started_at	finished_at	reference	reference_type	_is_consecutive	_status_code	space_id	transform_id	report_id	_logfile_id	environment_id	initiated_by_run_id	created_at	created_by_id	_aux	branch_id
id
1	3uaCCcz7qchuDKC6	None	2025-06-03 11:32:56.706805+00:00	NaT	None	None	None	0	1	1	NaN	None	NaN	NaN	2025-06-03 11:32:56.707000+00:00	1	None	1
2	2feNkI5AUb7w3Be4	None	2025-06-03 11:33:02.023664+00:00	2025-06-03 11:33:03.211575+00:00	None	None	True	0	1	2	3.0	None	2.0	NaN	2025-06-03 11:33:02.024000+00:00	1	None	1
3	yJl4FOYLQnAW6Oub	None	2025-06-03 11:33:03.843407+00:00	2025-06-03 11:33:03.896691+00:00	None	None	None	0	1	3	NaN	None	NaN	1.0	2025-06-03 11:33:03.843000+00:00	1	None	1
4	c8eaSJXhL1gJocvp	None	2025-06-03 11:33:04.331674+00:00	2025-06-03 11:33:04.388109+00:00	None	None	None	0	1	3	NaN	None	NaN	1.0	2025-06-03 11:33:04.332000+00:00	1	None	1

Storage

	uid	root	description	type	region	instance_uid	space_id	run_id	created_at	created_by_id	_aux	branch_id
id
1	9HGwj5zU9I3h	/home/runner/work/lamindb/lamindb/docs/test-track	None	local	None	73KPGC58ahU9	1	None	2025-06-03 11:32:52.421000+00:00	1	None	1

Transform

	uid	key	description	type	source_code	hash	reference	reference_type	space_id	_template_id	version	is_latest	created_at	created_by_id	_aux	branch_id
id
3	DFbFVl7tlmAz0000	track.ipynb/subset_dataframe.py	None	function	@ln.tracked()\ndef subset_dataframe(\n inpu...	F_wwrfFs6zmzMGVilG2Prg	None	None	1	None	None	True	2025-06-03 11:33:03.838000+00:00	1	None	1
2	PlNqv9GNO22b0000	run_track_with_params.py	run_track_with_params.py	script	import argparse\nimport lamindb as ln\n\nif __...	nRUs3ZjuVTbKtBmSXpVQ5A	None	None	1	None	None	True	2025-06-03 11:33:02.013000+00:00	1	None	1
1	lPGHfnTi6kow0000	track.ipynb	Track notebooks, scripts & functions	notebook	None	None	None	None	1	None	None	True	2025-06-03 11:32:56.695000+00:00	1	None	1

In a script¶

run_workflow.py¶

import argparse
import lamindb as ln

ln.Param(name="run_workflow_subset", dtype=bool).save()


@ln.tracked()
def subset_dataframe(
    artifact: ln.Artifact,
    subset_rows: int = 2,
    subset_cols: int = 2,
    run: ln.Run | None = None,
) -> ln.Artifact:
    dataset = artifact.load(is_run_input=run)
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    new_key = artifact.key.replace(".parquet", "_subsetted.parquet")
    return ln.Artifact.from_df(new_data, key=new_key, run=run).save()


if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--subset", action="store_true")
    args = p.parse_args()

    params = {"run_workflow_subset": args.subset}

    ln.track(params=params)

    if args.subset:
        df = ln.core.datasets.small_dataset1(otype="DataFrame")
        artifact = ln.Artifact.from_df(df, key="my_analysis/dataset.parquet").save()
        subsetted_artifact = subset_dataframe(artifact)

    ln.finish()

!python scripts/run_workflow.py --subset

ln.view()

Show code cell output Hide code cell output

Artifact

	uid	key	description	suffix	kind	otype	size	hash	n_files	n_observations	_hash_type	_key_is_virtual	_overwrite_versions	space_id	storage_id	schema_id	version	is_latest	run_id	created_at	created_by_id	_aux	branch_id
id
6	6H9S7BRvp7aFw7Cy0001	my_analysis/dataset_subsetted.parquet	None	.parquet	dataset	DataFrame	3852	gBAN-0lok9-D61VnFuHrAA	None	2.0	md5	True	False	1	1	None	None	True	4.0	2025-06-03 11:33:04.381000+00:00	1	None	1
5	6H9S7BRvp7aFw7Cy0000	my_analysis/dataset_subsetted.parquet	None	.parquet	dataset	DataFrame	3238	dNHL-WWN3PCVS9pyW8pKHA	None	2.0	md5	True	False	1	1	None	None	False	3.0	2025-06-03 11:33:03.890000+00:00	1	None	1
4	a25o6DzknEZNYGCB0000	my_analysis/dataset.parquet	None	.parquet	dataset	DataFrame	9868	Rr1T1Z_dH4OFekK6Y6AZzQ	None	3.0	md5	True	False	1	1	None	None	True	1.0	2025-06-03 11:33:03.815000+00:00	1	None	1
3	u07XW2oIe1zuVOpt0000	None	log streams of run FkNVPCIKs9tgYjZ6	.txt	__lamindb_run__	None	0	1B2M2Y8AsgTpgAmY7PhCfg	None	NaN	md5	True	False	1	1	None	None	True	NaN	2025-06-03 11:33:03.215000+00:00	1	None	1
2	0ijq7lqqFomKAnq20000	None	requirements.txt	.txt	__lamindb_run__	None	4116	aBb6D9c7xSKLS29QkFZ55g	None	NaN	md5	True	False	1	1	None	None	True	NaN	2025-06-03 11:33:03.208000+00:00	1	None	1
1	ma8gmwBVjzvO6xmo0000	my_file.fcs	None	.fcs	None	None	19330507	rCPvmZB19xs4zHZ7p_-Wrg	None	NaN	md5	True	False	1	1	None	None	True	1.0	2025-06-03 11:32:59.134000+00:00	1	None	1

Feature

	uid	name	dtype	is_type	unit	description	array_rank	array_size	array_shape	proxy_dtype	synonyms	_expect_many	_curation	space_id	type_id	run_id	created_at	created_by_id	_aux	branch_id
id
8	bqEiHpVfzSM8	run_workflow_subset	bool	None	None	None	0	0	None	None	None	None	None	1	None	NaN	2025-06-03 11:33:07.242000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1
7	fs0IM3re4l8n	output_artifact_key	str	None	None	None	0	0	None	None	None	None	None	1	None	1.0	2025-06-03 11:33:03.761000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1
6	ytIshNldZwVI	input_artifact_key	str	None	None	None	0	0	None	None	None	None	None	1	None	1.0	2025-06-03 11:33:03.752000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1
5	tu5WBU1e9gT4	subset_cols	int	None	None	None	0	0	None	None	None	None	None	1	None	1.0	2025-06-03 11:33:03.743000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1
4	WxELAsAIzY8w	subset_rows	int	None	None	None	0	0	None	None	None	None	None	1	None	1.0	2025-06-03 11:33:03.732000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1
3	ffzfKiXm35Pk	preprocess_params	dict	None	None	None	0	0	None	None	None	None	None	1	None	1.0	2025-06-03 11:32:59.304000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1
2	Akxki4JpcsEz	learning_rate	float	None	None	None	0	0	None	None	None	None	None	1	None	1.0	2025-06-03 11:32:59.294000+00:00	1	{'af': {'0': None, '1': True, '2': False}}	1

FeatureValue

	value	hash	space_id	feature_id	run_id	created_at	created_by_id	_aux	branch_id
id
1	./mydataset	71I4KdtOlqWZYoR9KaVTvw	1	1	NaN	2025-06-03 11:33:02.042000+00:00	1	None	1
2	0.01	BIF-_RHBU2Sm7COXgAOIYg	1	2	NaN	2025-06-03 11:33:02.044000+00:00	1	None	1
3	{'downsample': True, 'normalization': 'the_goo...	4ehQH8UO25aNM181K_gloQ	1	3	NaN	2025-06-03 11:33:02.047000+00:00	1	None	1
4	2	yB5yjZ1ML2NvBn-JzBSGLA	1	4	1.0	2025-06-03 11:33:03.858000+00:00	1	None	1
5	2	yB5yjZ1ML2NvBn-JzBSGLA	1	5	1.0	2025-06-03 11:33:03.861000+00:00	1	None	1
6	my_analysis/dataset.parquet	1ImgyYl4KlCl3XCd-aQE9Q	1	6	1.0	2025-06-03 11:33:03.863000+00:00	1	None	1
7	my_analysis/dataset_subsetted.parquet	G9luXJ51Hi4-Csrifos0Lw	1	7	1.0	2025-06-03 11:33:03.865000+00:00	1	None	1

Project

	uid	name	is_type	abbr	url	start_date	end_date	_status_code	space_id	type_id	run_id	created_at	created_by_id	_aux	branch_id
id
1	oF7eMq3zDyLe	My project	False	None	None	None	None	0	1	None	None	2025-06-03 11:32:55.765000+00:00	1	None	1

Run

	uid	name	started_at	finished_at	reference	reference_type	_is_consecutive	_status_code	space_id	transform_id	report_id	_logfile_id	environment_id	initiated_by_run_id	created_at	created_by_id	_aux	branch_id
id
1	3uaCCcz7qchuDKC6	None	2025-06-03 11:32:56.706805+00:00	NaT	None	None	None	0	1	1	NaN	None	NaN	NaN	2025-06-03 11:32:56.707000+00:00	1	None	1
2	2feNkI5AUb7w3Be4	None	2025-06-03 11:33:02.023664+00:00	2025-06-03 11:33:03.211575+00:00	None	None	True	0	1	2	3.0	None	2.0	NaN	2025-06-03 11:33:02.024000+00:00	1	None	1
3	yJl4FOYLQnAW6Oub	None	2025-06-03 11:33:03.843407+00:00	2025-06-03 11:33:03.896691+00:00	None	None	None	0	1	3	NaN	None	NaN	1.0	2025-06-03 11:33:03.843000+00:00	1	None	1
4	c8eaSJXhL1gJocvp	None	2025-06-03 11:33:04.331674+00:00	2025-06-03 11:33:04.388109+00:00	None	None	None	0	1	3	NaN	None	NaN	1.0	2025-06-03 11:33:04.332000+00:00	1	None	1
5	FkNVPCIKs9tgYjZ6	None	2025-06-03 11:33:07.256603+00:00	2025-06-03 11:33:08.220820+00:00	None	None	True	0	1	4	3.0	None	2.0	NaN	2025-06-03 11:33:07.257000+00:00	1	None	1
6	rxF1o2ePiIob78cP	None	2025-06-03 11:33:08.169258+00:00	2025-06-03 11:33:08.216938+00:00	None	None	None	0	1	5	NaN	None	NaN	5.0	2025-06-03 11:33:08.169000+00:00	1	None	1

Storage

	uid	root	description	type	region	instance_uid	space_id	run_id	created_at	created_by_id	_aux	branch_id
id
1	9HGwj5zU9I3h	/home/runner/work/lamindb/lamindb/docs/test-track	None	local	None	73KPGC58ahU9	1	None	2025-06-03 11:32:52.421000+00:00	1	None	1

Transform

	uid	key	description	type	source_code	hash	reference	reference_type	space_id	_template_id	version	is_latest	created_at	created_by_id	_aux	branch_id
id
5	qpx5RbVMBvaR0000	run_workflow.py/subset_dataframe.py	None	function	@ln.tracked()\ndef subset_dataframe(\n arti...	Dqbr_hMfHs17EhbPXP_PyQ	None	None	1	None	None	True	2025-06-03 11:33:08.166000+00:00	1	None	1
4	rviyX3HPYXrx0000	run_workflow.py	run_workflow.py	script	import argparse\nimport lamindb as ln\n\nln.Pa...	yqr8j5hTUulVRzv4J-o9SQ	None	None	1	None	None	True	2025-06-03 11:33:07.254000+00:00	1	None	1
3	DFbFVl7tlmAz0000	track.ipynb/subset_dataframe.py	None	function	@ln.tracked()\ndef subset_dataframe(\n inpu...	F_wwrfFs6zmzMGVilG2Prg	None	None	1	None	None	True	2025-06-03 11:33:03.838000+00:00	1	None	1
2	PlNqv9GNO22b0000	run_track_with_params.py	run_track_with_params.py	script	import argparse\nimport lamindb as ln\n\nif __...	nRUs3ZjuVTbKtBmSXpVQ5A	None	None	1	None	None	True	2025-06-03 11:33:02.013000+00:00	1	None	1
1	lPGHfnTi6kow0000	track.ipynb	Track notebooks, scripts & functions	notebook	None	None	None	None	1	None	None	True	2025-06-03 11:32:56.695000+00:00	1	None	1

Sync scripts with git¶

To sync with your git commit, add the following line to your script:

ln.settings.sync_git_repo = <YOUR-GIT-REPO-URL>

synced_with_git.py¶

import lamindb as ln

ln.settings.sync_git_repo = "https://github.com/..."
ln.track()
# your code
ln.finish()

Manage notebook templates¶

A notebook acts like a template upon using lamin load to load it. Consider you run:

lamin load https://lamin.ai/account/instance/transform/Akd7gx7Y9oVO0000

Upon running the returned notebook, you’ll automatically create a new version and be able to browse it via the version dropdown on the UI.

Additionally, you can:

label using ULabel, e.g., transform.ulabels.add(template_label)
tag with an indicative version string, e.g., transform.version = "T1"; transform.save()