Releases: databricks/lilac
v0.3.9
What's Changed
Other Changes
- Add skip_noisy_assignment to dataset.cluster by @dsmilkov in https://github.com/lilacai/lilac/pull/1194
- Fix a bug with excess RAM usage during vector computes. by @nsthorat in https://github.com/lilacai/lilac/pull/1195
Full Changelog: https://github.com/lilacai/lilac/compare/v0.3.8...v0.3.9
v0.3.8
What's Changed
Other Changes
- Fix llama-index test after upgrading deps by @dsmilkov in https://github.com/lilacai/lilac/pull/1192
- Respect the self._split param when computing embeddings for a text by @dsmilkov in https://github.com/lilacai/lilac/pull/1193
Full Changelog: https://github.com/lilacai/lilac/compare/v0.3.7...v0.3.8
v0.3.7
What's Changed
Other Changes
- Fix several FOSSA security vulnerabilities. by @nsthorat in https://github.com/lilacai/lilac/pull/1191
Full Changelog: https://github.com/lilacai/lilac/compare/v0.3.6...v0.3.7
v0.3.6
What's Changed
Other Changes
- Add format selectors to the compute clusters UI. by @nsthorat in https://github.com/lilacai/lilac/pull/1185
- Fix bug when there's no dataset format and we fail to cluster. by @nsthorat in https://github.com/lilacai/lilac/pull/1189
Full Changelog: https://github.com/lilacai/lilac/compare/v0.3.5...v0.3.6
v0.3.5
This release adds the Nomic 1.5 and bge-m3 embeddings as built-ins.
We have also made it easier to add selections to concepts (demo video: add-to-concept.mp4).
Features
- Support the FastAPI app being mounted. by @nsthorat in https://github.com/lilacai/lilac/pull/1174
- Add bge-m3 and Nomic 1.5 embeddings. by @nsthorat in https://github.com/lilacai/lilac/pull/1182
- Change link to selection => copy link to selection. by @nsthorat in https://github.com/lilacai/lilac/pull/1175
UI Changes
- Power-law histogram generations by @brilee in https://github.com/lilacai/lilac/pull/1165
Clustering
- Add support for calling mistral for titling (no public API yet) by @dsmilkov in https://github.com/lilacai/lilac/pull/1168
Demo
- Add GAIR-lima to the HF demo. by @nsthorat in https://github.com/lilacai/lilac/pull/1171
- Add ultrachat embeddings to public demo. by @nsthorat in https://github.com/lilacai/lilac/pull/1173
Bug fixes
- Format N/A values separately in histograms by @brilee in https://github.com/lilacai/lilac/pull/1169
- Fix bug with editing filters & keyword search of '' by @nsthorat in https://github.com/lilacai/lilac/pull/1180
- Fix issue with FastAPI mounts. by @nsthorat in https://github.com/lilacai/lilac/pull/1183
- Fix a few UI issues with concepts & UI bugs by @nsthorat in https://github.com/lilacai/lilac/pull/1179
Other Changes
- Variables for OpenAI API and Model by @drikster80 in https://github.com/lilacai/lilac/pull/1172
New Contributors
- @drikster80 made their first contribution in https://github.com/lilacai/lilac/pull/1172
Full Changelog: https://github.com/lilacai/lilac/compare/v0.3.4...v0.3.5
v0.3.4
This release adds task cancellation to the UI and fixes a set of bugs around exporting, and some UI weirdness.
Features
- Implement signal cancellation by @brilee in https://github.com/lilacai/lilac/pull/1154
- Create Cancelled task status by @brilee in https://github.com/lilacai/lilac/pull/1163
- Make the cancel button in the UI work. by @nsthorat in https://github.com/lilacai/lilac/pull/1162
Bug fixes
- Fix unusual inputs to auto binning histogram by @brilee in https://github.com/lilacai/lilac/pull/1151
- Fix errors when the concept is empty by @dsmilkov in https://github.com/lilacai/lilac/pull/1158
- Fix issue where, by default, long media doesn't take up the full screen by @nsthorat in https://github.com/lilacai/lilac/pull/1159
- Fix some issues with exporting. by @nsthorat in https://github.com/lilacai/lilac/pull/1160
Public demo
- Add OpenHermes-2.5 to the public demo. by @nsthorat in https://github.com/lilacai/lilac/pull/1156
Docs
- Add the garden blog post by @dsmilkov in https://github.com/lilacai/lilac/pull/1144
- Update Quickstart (UI and python) by @brilee in https://github.com/lilacai/lilac/pull/1145
- Add links to Lilac Garden page by @brilee in https://github.com/lilacai/lilac/pull/1132
- Update docs - address PR comments from #1145 by @brilee in https://github.com/lilacai/lilac/pull/1147
- Tiny blog update (remove yet another logo) by @dsmilkov in https://github.com/lilacai/lilac/pull/1149
- Fix garden links by @dsmilkov in https://github.com/lilacai/lilac/pull/1157
Full Changelog: https://github.com/lilacai/lilac/compare/v0.3.3...v0.3.4
v0.3.3
This release is mostly bug fixes and one API change for exporting.
All export methods now take an include_signals flag. By default, signals computed in Lilac are not exported as extra metadata, so your source data is preserved.
For example:
hf_ds = ds.to_huggingface(include_signals=True)
What's Changed
Clustering
- Cache dataset.pivot() and make cluster search box more visible by @dsmilkov in https://github.com/lilacai/lilac/pull/1126
- Add some polish to the clusters page. Fix some other UI bugs. by @nsthorat in https://github.com/lilacai/lilac/pull/1128
- Add progress bars for JINA embedding for local clustering by @brilee in https://github.com/lilacai/lilac/pull/1138
- Speedup rendering of cluster view by @dsmilkov in https://github.com/lilacai/lilac/pull/1137
Bug fixes
- Fix a few small bugs by testing prod mode by @dsmilkov in https://github.com/lilacai/lilac/pull/1129
- Add --deploy_at_head to deploy_project. Fix bug with percentages. by @nsthorat in https://github.com/lilacai/lilac/pull/1131
- Rename USE_TABLE_INDEX => LILAC_USE_TABLE_INDEX. Add LILAC_PROD_MODE. by @nsthorat in https://github.com/lilacai/lilac/pull/1134
- Fix signal info error by @dsmilkov in https://github.com/lilacai/lilac/pull/1136
- Improve clusters and several bug fixes by @dsmilkov in https://github.com/lilacai/lilac/pull/1141
- Fix issue with urls that end with a slash. by @nsthorat in https://github.com/lilacai/lilac/pull/1143
Docs
- Update readme.md by @dsmilkov in https://github.com/lilacai/lilac/pull/1133
Demo
- Add hacker news comments to the public demo. by @nsthorat in https://github.com/lilacai/lilac/pull/1135
Performance
- Handle sqlite files separately during table/index creation by @brilee in https://github.com/lilacai/lilac/pull/1140
API
- Add exclude_signals to select_rows, and include_signals to export methods. by @nsthorat in https://github.com/lilacai/lilac/pull/1139
Full Changelog: https://github.com/lilacai/lilac/compare/v0.3.2...v0.3.3
v0.3.2
This release is mostly bug fixes.
Bug fixes
- Fix pandas deprecation warning by @brilee in https://github.com/lilacai/lilac/pull/1123
- Fix "open dataset and apply concept" by @dsmilkov in https://github.com/lilacai/lilac/pull/1124
- Fix concept labeler when we index a repeated string (capybara) by @dsmilkov in https://github.com/lilacai/lilac/pull/1122
- Fix a few bugs related to concepts and clustering by @dsmilkov in https://github.com/lilacai/lilac/pull/1121
Lilac Garden & Clustering
- Add mosaic-instruct-v3 by @brilee in https://github.com/lilacai/lilac/pull/1116
- Add eval datasets to the huggingface demo. by @nsthorat in https://github.com/lilacai/lilac/pull/1119
- Add more demo datasets. by @nsthorat in https://github.com/lilacai/lilac/pull/1120
Full Changelog: https://github.com/lilacai/lilac/compare/v0.3.1...v0.3.2
v0.3.1
Bugs
- Fix a non blocking start_server by @dsmilkov in https://github.com/lilacai/lilac/pull/1117
Docs
- Add a clustering guide by @dsmilkov in https://github.com/lilacai/lilac/pull/1114
Full Changelog: https://github.com/lilacai/lilac/compare/v0.3.0...v0.3.1
v0.3.0
This release extends our exporting capabilities and adds support for loading custom embeddings.
Because the shape of exported data has changed, this is a breaking change, so we released it as 0.3.0.
Loading custom embeddings
Loading pre-computed embeddings from an external source is now possible. See our Custom embeddings guide for more details.
# Look up the pre-computed embedding for an item by its id.
def _load_embedding(item):
  return vector_store[item['id']]

# Load the embeddings into Lilac.
ds.load_embedding(
  load_fn=_load_embedding, index_path='text', embedding='my_embedding', overwrite=True
)
Export to HuggingFace
You can now export to a HuggingFace dataset.
# Export a Lilac dataset to a huggingface dataset.
hf_ds = ds.to_huggingface()
# Optionally: use the HuggingFace API to push the dataset to the hub.
hf_ds.push_to_hub('lilacai/glaive-function-calling-v2-sharegpt')
Exporting no longer flattens data
Before this release, exporting would flatten source data. For instance, data that looks like:
{
  'conversations': [{
    'from': 'user',
    'value': 'Hello there'
  }]
}
would get exported incorrectly as:
{'conversations.*.from': ['user'], 'conversations.*.value': ['Hello there']}
Now it is exported exactly the way it was shaped when importing.
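To make the difference between the two shapes concrete, here is a minimal sketch in plain Python that converts the old flattened form back into the nested form. The unflatten helper is hypothetical and not part of Lilac; it handles only the one-level 'field.*.subfield' pattern shown above.

```python
def unflatten(flat):
  """Rebuild nested rows from keys of the form 'field.*.subfield'."""
  nested = {}
  for key, values in flat.items():
    # 'conversations.*.from' -> ('conversations', '.*.', 'from')
    field, _, subfield = key.partition('.*.')
    # One dict per repeated element, created the first time a field is seen.
    rows = nested.setdefault(field, [{} for _ in values])
    for row, value in zip(rows, values):
      row[subfield] = value
  return nested


flat = {'conversations.*.from': ['user'], 'conversations.*.value': ['Hello there']}
nested = unflatten(flat)
```

A helper along these lines may be useful if you still have files exported before 0.3.0 and want them in the same shape as new exports.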
What's Changed
Features
- Add support for loading custom embeddings. by @nsthorat in https://github.com/lilacai/lilac/pull/1090
- Fix dataset export to avoid flattening the user data by @dsmilkov in https://github.com/lilacai/lilac/pull/1091
- Export to HuggingFace. Support glaive-function-calling-v2 in the demo, clusters, and via sharegpt. by @nsthorat in https://github.com/lilacai/lilac/pull/1113
Performance
- Speed up PII and lang detection by making them multiprocess by @dsmilkov in https://github.com/lilacai/lilac/pull/1097
Bug fixes
- Bug fixes: overwrite, task errors, embedding keys. by @nsthorat in https://github.com/lilacai/lilac/pull/1098
- Fixed cache busting behavior by @brilee in https://github.com/lilacai/lilac/pull/1099
- Small fixes for the demo. by @nsthorat in https://github.com/lilacai/lilac/pull/1106
- Fix a bug where we drop source fields that have embeddings computed on them by @nsthorat in https://github.com/lilacai/lilac/pull/1093
- Couple of small bug fixes by @dsmilkov in https://github.com/lilacai/lilac/pull/1109
- Fix edge case where table doesn't exist and doesn't get created by @brilee in https://github.com/lilacai/lilac/pull/1110
- Fix the cluster sort by membership score bug by @dsmilkov in https://github.com/lilacai/lilac/pull/1112
Lilac Garden
- Rename remote => use_garden. by @nsthorat in https://github.com/lilacai/lilac/pull/1092
- Fix chunking bug for remote embedding computation by @dsmilkov in https://github.com/lilacai/lilac/pull/1096
- Add accelerated PII execution on Lilac Garden by @dsmilkov in https://github.com/lilacai/lilac/pull/1103
- Move use_garden outside of a Signal. by @nsthorat in https://github.com/lilacai/lilac/pull/1102
UI
- Refactor buttons so we have a single cluster button. by @nsthorat in https://github.com/lilacai/lilac/pull/1111
Full Changelog: https://github.com/lilacai/lilac/compare/v0.2.5...v0.3.0