1/23/2024

Read in json file elixir ecto

We run apps for our users on hardware we host around the world. Fly.io happens to be a great place to run Phoenix applications.

I love ExDocs. I think they are one of the best parts of the Elixir ecosystem, and I frankly cannot shut up about them. My one complaint is that it can sometimes be hard to find exactly what you are looking for using ExDoc’s built-in search. My favorite example is that I know that Ecto.Repo has a function to dump the generated SQL to a string, but if you search inside of Ecto you get nothing. That’s because it’s in a different project called ecto_sql.

What if we had one spot we could go to and search all of HexDocs? In this post we will walk through how I downloaded, cleaned up, indexed and searched all of HexDocs using SQLite FTS5 and LiveView! It turned into a project that matches the real world incredibly well, warts and all! So I won’t be walking through every single step, but check it out on GitHub and try it out hosted here on Fly.io. Let’s begin!

Downloadin’ Hex Docs

When you visit an ExDoc package and try using the search box, what you are using is a fully in-browser search experience provided by Lunr.js. When the ExDoc-generated page is first visited, it downloads a JavaScript payload that contains the “Search Data”, which is then indexed in the browser via Lunr.js, compressed and stored in the browser’s SessionStorage. Then there is some more JavaScript that does the autocomplete and rendering of the search fully in the browser.

We’re going to take advantage of that Search Data, which comes with every Hex package. ExDoc generates this for every package and uploads it to HexDocs for you. And because ExDoc has a special place in the world of Elixir, every Elixir project depends on it, including Elixir. So it can’t really depend on any project that’s not Elixir or fully vendored, like Lunr.js. Which means that when ExDoc creates the searchData.js file, it has to encode the JSON directly. The good news is the “items” all have the same document shape; the meh news is that this filename is different for every deploy.

Garden Variety Extract, Transform and Load Pipeline

This project started as a fun project that scratches an itch for me, but building out an ETL pipeline is as “Real World” as programming gets. Much like the “Real World”, this part of the project kept expanding in scope as I went along. So we don’t get lost in the weeds, here is a little road map to keep us on track.

E: Get the data
- Use the Hex API to get a listing of all the packages.
- Scrape the search page of every package that has documentation, looking for a script tag that contains search_items, and download that file.

T: Transform the data
- We’re going to strip off all JavaScript in that file till we just have JSON, and then clean it up.
- We’re going to parse that JSON using Jason; if that fails, use a hand-rolled JSON parser that works with the cases that don’t work with Jason.

L: Store this into a SQLite table

    schema "packages" do
      field :meta, :map
      field :name, :string
      field :docs_html_url, :string
      field :downloads_all, :integer, default: 0
      field :downloads_day, :integer, default: 0
      field :downloads_recent, :integer, default: 0
      field :downloads_week, :integer, default: 0
      field :latest_docs_url, :string
      field :html_url, :string
      field :latest_stable_version, :string
      field :last_pulled, :utc_datetime
      field :search_items_json, […]

    […]
          where: fragment("packages_index MATCH ?", ^term_quoted)
        )

      preload_query = from(p in Package, select: […]) # See Part 2
    end

SQLite FTS5 is a sharp tool: if you query something and don’t manually quote it, it will blow up with incomprehensible errors. We also want spaces to be inclusive, so we replace them with +. A trailing period needs handling too, because the search will return no results if there is an ending period.

Then we set up a base query, with a function_rank, a module_rank and a where clause limiting us to the search term. SQLite uses the bm25 algorithm for estimating a document’s relevance based on a query. In this case, the function_rank weights the function name column higher than all the rest. The same goes for the module column in module_rank.
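Two of the smaller steps described in this post, stripping the JavaScript off the search data file and quoting a search term for FTS5, can be sketched in plain Elixir. This is only a sketch under stated assumptions, not the project’s actual code: the module name `HexSearchSketch` and both function names are made up for illustration, and `strip_js/1` assumes the file is a single assignment of the form `name=<json>`, which may not hold for every ExDoc version. The quoting relies on the FTS5 rule that `+` between quoted strings joins them into one phrase.

```elixir
defmodule HexSearchSketch do
  @moduledoc """
  Hypothetical helpers for illustration only; names and input shapes are assumptions.
  """

  @doc """
  Strips the JavaScript wrapper from a search data file.

  Assumes the file is one assignment, `name=<json>`, with an optional trailing `;`.
  Returns `{:ok, json_text}` or `:error` when no `=` is found.
  """
  def strip_js(js) when is_binary(js) do
    # Split only on the first "=", so any "=" inside the JSON itself is left alone.
    case String.split(js, "=", parts: 2) do
      [_variable, rest] ->
        json =
          rest
          |> String.trim()
          |> String.trim_trailing(";")
          |> String.trim()

        {:ok, json}

      [_no_assignment] ->
        :error
    end
  end

  @doc """
  Builds an FTS5 MATCH string from raw user input.

  Drops trailing periods from the whole term, wraps every word in double quotes
  (doubling any embedded quote), and joins the words with ` + `.
  Returns "" for blank input; the caller should skip the query in that case.
  """
  def quote_term(term) when is_binary(term) do
    term
    |> String.trim()
    |> String.trim_trailing(".")
    |> String.split()
    |> Enum.map(fn word -> "\"" <> String.replace(word, "\"", "\"\"") <> "\"" end)
    |> Enum.join(" + ")
  end
end
```

The JSON text returned by `strip_js/1` would then be handed to the JSON parser, and the result of `quote_term/1` would be the value bound in `fragment("packages_index MATCH ?", ^term_quoted)`.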