Performance idea: generator for object-style representation

In the past we've raised some issues about peppy performance (see #388, #387). Peppy is fine for small projects (hundreds or even thousands of sample rows), but it gets slow when we are dealing with huge projects, like tens to hundreds of thousands of samples.

It would be nice if peppy could handle these very large projects.

One of the problems is that peppy stores sample information in two forms: as a table (a pandas `DataFrame`) and as a list of `Sample` objects. This duplicates the information in memory.

An idea for improving the performance could be to switch to a single in-memory representation. But we really want to be able to access the metadata in both ways for different use cases... so what about using the pandas `DataFrame` as the main data structure, and then providing some kind of generator that could go through it and create objects on the fly, in case someone wants the list-based approach?
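To make the idea concrete, here is a rough sketch of what such a generator could look like. It is not peppy's actual API: `SampleView` and `iter_samples` are hypothetical names made up for illustration, and the real `Sample` class carries more behavior than this stand-in. The point is only that the table stays the single copy of the data, and one lightweight object exists at a time while iterating.

```python
from typing import Any, Dict, Iterator

import pandas as pd


class SampleView:
    """Hypothetical, minimal stand-in for a Sample object (not peppy's class)."""

    def __init__(self, attributes: Dict[str, Any]) -> None:
        self._attributes = attributes

    def __getitem__(self, key: str) -> Any:
        # Look up a sample attribute by column name.
        return self._attributes[key]


def iter_samples(sample_table: pd.DataFrame) -> Iterator[SampleView]:
    """Lazily yield one SampleView per row of the sample table.

    The DataFrame remains the only full copy of the metadata; each object
    is built when requested and can be garbage-collected afterwards.
    """
    columns = list(sample_table.columns)
    # itertuples(name=None) returns plain tuples, which is cheaper than iterrows().
    for values in sample_table.itertuples(index=False, name=None):
        yield SampleView(dict(zip(columns, values)))


# Usage: iterate without ever materializing a full list of objects.
table = pd.DataFrame(
    {"sample_name": ["s1", "s2"], "protocol": ["ATAC", "RNA"]}
)
for sample in iter_samples(table):
    print(sample["sample_name"], sample["protocol"])
```

Anyone who really needs the list-based form could still call `list(iter_samples(table))`, paying the memory cost only when they ask for it.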

This could be one way to increase performance.

Performance idea: generator for object-style representation #432
