gitmyhub

bleve

Go ★ 11k updated 2d ago

A modern text/numeric/geo-spatial/vector indexing library for Go

Bleve is a Go library you embed directly in your app to add full-text search, including fuzzy matching, geographic search, and vector similarity, with no separate search server required.

Go · setup: moderate · complexity 3/5

Bleve is a search and indexing library for the Go programming language. You embed it into a Go application to add full-text search capabilities, without needing a separate search server running alongside your app. You give it your Go data structures or JSON documents, and it builds a local index on disk that you can query with a range of search techniques.
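To make the idea of "building an index you can query" concrete, here is a minimal conceptual sketch of an inverted index using only the Go standard library. It is not Bleve's API or its on-disk format: the `invertedIndex` type and its `add` and `search` methods are hypothetical names chosen for illustration, and the whitespace tokenizer is a deliberate simplification of real text analysis.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// invertedIndex maps each lowercased term to the set of document IDs that
// contain it. Hypothetical illustration only; not Bleve's index format.
type invertedIndex map[string]map[string]struct{}

// add splits text on whitespace, lowercases it, and records docID under
// every resulting term.
func (idx invertedIndex) add(docID, text string) {
	for _, term := range strings.Fields(strings.ToLower(text)) {
		if idx[term] == nil {
			idx[term] = make(map[string]struct{})
		}
		idx[term][docID] = struct{}{}
	}
}

// search returns the sorted IDs of documents containing the exact term
// (case-insensitive).
func (idx invertedIndex) search(term string) []string {
	postings := idx[strings.ToLower(term)]
	ids := make([]string, 0, len(postings))
	for id := range postings {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	idx := invertedIndex{}
	idx.add("doc1", "Bleve is a search library")
	idx.add("doc2", "Go search tools")

	fmt.Println(idx.search("search")) // [doc1 doc2]
	fmt.Println(idx.search("bleve"))  // [doc1]
}
```

A real library adds language-aware analysis, scoring, and persistent storage on top of this basic term-to-documents lookup.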

The library supports several field types: text, numbers, dates, booleans, geographic points and shapes, IP addresses, and vectors. Queries can match exact terms, phrases, prefixes, wildcard patterns, fuzzy matches (which tolerate slight misspellings), and regular expressions. Range queries cover numeric and date fields, and compound queries combine conditions with AND, OR, and NOT logic. There is also geographic search for finding results within a distance or shape, approximate nearest-neighbor vector search for semantic similarity, synonym expansion, and hierarchical nested document search.
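Fuzzy matching is usually described in terms of edit distance: a term matches if it can be turned into the query with at most a few single-character edits. The sketch below illustrates that idea with a plain Levenshtein distance in standard-library Go. It is an assumption that this mirrors how Bleve defines fuzziness, and `editDistance` and `fuzzyMatch` are hypothetical helper names, not Bleve functions.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// editDistance returns the Levenshtein distance between a and b: the minimum
// number of single-character insertions, deletions, and substitutions needed
// to turn one into the other.
func editDistance(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // distance from the empty prefix of a
	}
	for i := 1; i <= len(ra); i++ {
		cur := make([]int, len(rb)+1)
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(rb)]
}

// fuzzyMatch reports whether candidate is within maxEdits edits of query.
// Hypothetical helper for illustration.
func fuzzyMatch(query, candidate string, maxEdits int) bool {
	return editDistance(query, candidate) <= maxEdits
}

func main() {
	terms := []string{"search", "serch", "sketch", "index"}
	var hits []string
	for _, t := range terms {
		if fuzzyMatch("search", t, 1) {
			hits = append(hits, t)
		}
	}
	fmt.Println(hits) // [search serch]
}
```

With a maximum of one edit, the misspelling "serch" still matches "search", while unrelated terms do not.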

Bleve supports two standard relevance scoring models, TF-IDF and BM25. It also supports hybrid search, which combines exact text matching with vector-based semantic matching and uses a fusion method to merge the two sets of results. Search results support match highlighting, pagination, and faceting, which reports how many results fall into each bucket, such as a category or a date range.
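As an illustration of what a fusion method does, the sketch below merges a text ranking and a vector ranking with reciprocal rank fusion (RRF), a widely used technique in which each document scores 1/(k + rank) in every list it appears in. Treat it as an assumption that this corresponds to any of Bleve's own fusion options. The `fuseRRF` function and the constant k = 60 are illustrative choices, not taken from the library.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfConstant is the smoothing constant k in 1/(k + rank). The value 60 is a
// commonly used default and is an assumption here.
const rrfConstant = 60.0

// fuseRRF merges several ranked lists of document IDs (best first) into one
// list ordered by reciprocal rank fusion score. Ties are broken by ID so the
// output is deterministic. Hypothetical helper for illustration.
func fuseRRF(rankings ...[]string) []string {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for i, id := range ranking {
			// Ranks are 1-based, so the top hit contributes 1/(k+1).
			scores[id] += 1.0 / (rrfConstant + float64(i+1))
		}
	}

	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	textHits := []string{"a", "b", "c"}   // ranking from exact text matching
	vectorHits := []string{"b", "d", "a"} // ranking from vector similarity

	fmt.Println(fuseRRF(textHits, vectorHits)) // [b a d c]
}
```

Documents that rank well in both lists ("a" and "b") rise above those that appear in only one, which is the point of combining the two kinds of match.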

A command-line tool is included for managing indexes directly from the terminal. It can create indexes, add documents from JSON files, count documents, inspect field mappings, and run queries. Text analysis is built in for more than 30 languages, from English and French to Arabic, Chinese, Japanese, Korean, Russian, and Turkish.

The project is available under the Apache 2.0 license. Discussion and issue tracking happen on GitHub and a Google group.

Where it fits