<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>l1x/dev</title>
    <link>https://dev.l1x.be</link>
    <description>Thoughts on DevOps, Rust, Unix, data engineering and high performance computing</description>
    <language>en</language>
    <managingEditor>Istvan</managingEditor>
    <atom:link href="https://dev.l1x.be/feed.xml" rel="self" type="application/rss+xml"/>
    <pubDate>Fri, 07 Feb 2025 13:45:01 +0100</pubDate>
    <lastBuildDate>Thu, 19 Mar 2026 14:20:49 +0000</lastBuildDate>
    <item>
      <title>Processing CloudFront Logs with Rust</title>
      <link>https://dev.l1x.be/posts/2025/02/07/processing-cloudfront-logs-with-rust/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2025/02/07/processing-cloudfront-logs-with-rust/</guid>
      <content:encoded><![CDATA[<p><img src="/static/img/og/bear.webp" alt="Bear" /></p>
<h2 id="processing-data-with-rust"><a href="#processing-data-with-rust">Processing data with Rust</a></h2>
<p>In data processing, Rust has emerged as a compelling alternative to traditional ETL languages like Python or Java. Its unique combination of performance, safety, and modern features makes it an attractive choice for handling large-scale data streams and complex computational tasks. Libraries like Polars make Rust a good option for building data processing pipelines. With its growing ecosystem of libraries and frameworks, Rust is primed to play a significant role in shaping the future of data-intensive applications.</p>
<p>As a toy project, I explored how to use Rust to transform AWS CloudFront logs into Parquet format. As of February 2025 this need is largely obsolete: AWS has released a new logging stack that writes Parquet files directly to S3, so no converter is needed going forward. For already existing files, however, a converter might still be useful.</p>
<h3 id="aws-cloudfront-logs"><a href="#aws-cloudfront-logs">AWS CloudFront Logs</a></h3>
<p>Amazon CloudFront is a content delivery network (CDN) service offered by Amazon Web Services (AWS). CloudFront is designed to deliver content to users by caching it at edge locations around the world. This reduces the distance that the data must travel and makes the communication faster. CloudFront can save its access logs to an S3 bucket in several formats, including TSV (tab-separated values).</p>
<p>The log files in the bucket look like this:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">Jan</span> <span style="color: #e6edf3;">12:58</span> <span style="color: #e6edf3;">E2F28FYJOwT1P9.2024-01-23-11.84ac498a.gz</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">Jan</span> <span style="color: #e6edf3;">13:03</span> <span style="color: #e6edf3;">E2F28FYJOwT1P9.2024-01-23-11.c34e4413.gz</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">Jan</span> <span style="color: #e6edf3;">13:08</span> <span style="color: #e6edf3;">E2F28FYJOwT1P9.2024-01-23-12.db61b94a.gz</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">Jan</span> <span style="color: #e6edf3;">13:18</span> <span style="color: #e6edf3;">E2F28FYJOwT1P9.2024-01-23-12.46529b24.gz</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">Jan</span> <span style="color: #e6edf3;">13:28</span> <span style="color: #e6edf3;">E2F28FYJOwT1P9.2024-01-23-12.d77399a4.gz</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">Jan</span> <span style="color: #e6edf3;">13:33</span> <span style="color: #e6edf3;">E2F28FYJOwT1P9.2024-01-23-12.6f1c7b5e.gz</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">Jan</span> <span style="color: #e6edf3;">13:58</span> <span style="color: #e6edf3;">E2F28FYJOwT1P9.2024-01-23-12.a5e6ddca.gz</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">Jan</span> <span style="color: #e6edf3;">14:03</span> <span style="color: #e6edf3;">E2F28FYJOwT1P9.2024-01-23-12.4d5e6dbc.gz</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">Jan</span> <span style="color: #e6edf3;">14:18</span> <span style="color: #e6edf3;">E2F28FYJOwT1P9.2024-01-23-13.7125fd04.gz</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">Jan</span> <span style="color: #e6edf3;">14:38</span> <span style="color: #e6edf3;">E2F28FYJOwT1P9.2024-01-23-13.7d3dbdf0.gz</span>
</div></code></pre>
<p>Looking into a log file, we find the following (long lines are broken into shorter ones for readability):</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">gzcat</span> <span style="color: #e6edf3;">gz/E3F26FYJOTTOP6.2024-01-23-12.4d5e6dbc.gz</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">tr</span> <span style="color: #a5d6ff;">&#39;\t&#39;</span> <span style="color: #a5d6ff;">&#39; &#39;</span>
</div><div class="line" data-line="2"><span style="color: #8b949e;">#Version: 1.0</span>
</div><div class="line" data-line="3"><span style="color: #8b949e;">#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem </span>
</div><div class="line" data-line="4"> <span style="color: #d2a8ff;">sc-status</span> cs<span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">Referer</span><span style="color: #e6edf3;">)</span> <span style="color: #d2a8ff;">cs</span><span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">User-Agent</span><span style="color: #e6edf3;">)</span> cs-uri-query <span style="color: #d2a8ff;">cs</span><span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">Cookie</span><span style="color: #e6edf3;">)</span> x-edge-result-type 
</div><div class="line" data-line="5"> <span style="color: #d2a8ff;">x-edge-request-id</span> <span style="color: #e6edf3;">x-host-header</span> <span style="color: #e6edf3;">cs-protocol</span> <span style="color: #e6edf3;">cs-bytes</span> <span style="color: #e6edf3;">time-taken</span> <span style="color: #e6edf3;">x-forwarded-for</span> 
</div><div class="line" data-line="6"> <span style="color: #d2a8ff;">ssl-protocol</span> <span style="color: #e6edf3;">ssl-cipher</span> <span style="color: #e6edf3;">x-edge-response-result-type</span> <span style="color: #e6edf3;">cs-protocol-version</span> <span style="color: #e6edf3;">fle-status</span> 
</div><div class="line" data-line="7"> <span style="color: #d2a8ff;">fle-encrypted-fields</span> <span style="color: #e6edf3;">c-port</span> <span style="color: #e6edf3;">time-to-first-byte</span> <span style="color: #e6edf3;">x-edge-detailed-result-type</span> 
</div><div class="line" data-line="8"> <span style="color: #d2a8ff;">sc-content-type</span> <span style="color: #e6edf3;">sc-content-len</span> <span style="color: #e6edf3;">sc-range-start</span> <span style="color: #e6edf3;">sc-range-end</span>
</div><div class="line" data-line="9">
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">2024-01-23</span> <span style="color: #e6edf3;">12:59:15</span> <span style="color: #e6edf3;">LAX53-P1</span> <span style="color: #79c0ff;">1897</span> <span style="color: #e6edf3;">100.100.100.100</span> <span style="color: #e6edf3;">GET</span> <span style="color: #e6edf3;">d197w9j7zcw6og.cloudfront.net</span> 
</div><div class="line" data-line="11"><span style="color: #d2a8ff;">/ai/riding-hood/</span> <span style="color: #79c0ff;">200</span> <span style="color: #e6edf3;">https://dev.l1x.be/ai/riding-hood/</span> 
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">Mozilla/5.0%20</span><span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">Macintosh</span><span style="color: #e6edf3;">;</span><span style="color: #d2a8ff;">%20Intel%20Mac%20OS%20X%2010_15_7</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;"></span><span style="color: #d2a8ff;">%20AppleWebKit/537.36%20</span><span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">KHTML,%20like%20Gecko</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;"></span><span style="color: #d2a8ff;">%20Chrome/114.0.0.0%20Safari/537.36</span> 
</div><div class="line" data-line="13"><span style="color: #d2a8ff;">-</span> <span style="color: #e6edf3;">-</span> <span style="color: #e6edf3;">RefreshHit</span> <span style="color: #e6edf3;">UKrRsY4BzxJsshXB71YE7R2zxpoB3npsKy39TyIRKQJFw_DusKph4t==</span> 
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">dev.l1x.be</span> <span style="color: #e6edf3;">https</span> <span style="color: #79c0ff;">330</span> <span style="color: #e6edf3;">0.301</span> 
</div><div class="line" data-line="15"><span style="color: #d2a8ff;">-</span> <span style="color: #e6edf3;">TLSv1.3</span> <span style="color: #e6edf3;">TLS_AES_128_GCM_SHA256</span> <span style="color: #e6edf3;">RefreshHit</span> <span style="color: #e6edf3;">HTTP/1.1</span> <span style="color: #e6edf3;">-</span> <span style="color: #e6edf3;">-</span> 
</div><div class="line" data-line="16"><span style="color: #79c0ff;">48634</span> <span style="color: #e6edf3;">0.301</span> <span style="color: #e6edf3;">RefreshHit</span> <span style="color: #e6edf3;">text/html</span> <span style="color: #e6edf3;">-</span> <span style="color: #e6edf3;">-</span> <span style="color: #e6edf3;">-</span>
</div></code></pre>
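<p>The <code>#Fields:</code> line is what gives the columns their names. As a minimal sketch (the function name <code>parse_fields_header</code> is hypothetical and not part of the original project), the column names could be extracted like this:</p>

```rust
/// Extracts the column names from a CloudFront `#Fields:` header line.
/// Returns `None` when the line is not a `#Fields:` header.
fn parse_fields_header(line: &str) -> Option<Vec<&str>> {
    // strip_prefix yields the remainder of the line only if the prefix matches
    let rest = line.strip_prefix("#Fields:")?;
    // field names are separated by whitespace
    Some(rest.split_whitespace().collect())
}

fn main() {
    let header = "#Fields: date time x-edge-location sc-bytes c-ip";
    let columns = parse_fields_header(header).expect("not a #Fields header");
    println!("{} columns, first: {}", columns.len(), columns[0]);

    // other comment lines are not treated as a field list
    assert!(parse_fields_header("#Version: 1.0").is_none());
}
```

<p>Reading the names from the file, instead of hard-coding them, keeps a converter working if the set of logged fields changes.</p>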
<p>It would be great to have a single file per day in a format that supports computing summary statistics on the dataset. With an ETL job that runs daily, it is simple to add each new day of data. In high-traffic environments, such as e-commerce sites, this job might run hourly.</p>
<h3 id="using-parquet"><a href="#using-parquet">Using Parquet</a></h3>
<p>After working with these file formats for a long time, I believe Parquet is an excellent option for long-term storage of data. It is a columnar storage format that is becoming increasingly popular in the data engineering world due to its numerous advantages.</p>
<p>Here are some of the key reasons why using Parquet is a good idea for data engineering:</p>
<ul>
<li>Reduced Storage Requirements:</li>
</ul>
<p>Parquet's columnar layout allows for more efficient compression, resulting in significantly smaller files than row-based formats like CSV. This is particularly beneficial for large datasets, as it reduces both storage costs and the disk space required. Combined with a compression codec, especially Zstandard (ZSTD), it yields excellent results.</p>
<ul>
<li>Faster Query Performance Compared to Text Data Formats:</li>
</ul>
<p>Parquet enables faster query execution compared to text data formats, especially for complex analytical queries that involve aggregations or filtering on specific columns. Since a reader only has to fetch the required data (usually the footer metadata and the specific columns a query touches), it minimizes disk I/O and improves processing speed. This makes it well-suited for large-scale data analysis and data warehousing applications.</p>
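<p>To illustrate why a columnar layout helps, here is a toy in-memory analogy. It is not Parquet's actual on-disk format, and the <code>Row</code>, <code>Columns</code>, <code>to_columns</code> and <code>total_bytes</code> names are made up for this sketch. Summing one field only touches one vector instead of walking every full record:</p>

```rust
/// Row-oriented layout: one struct per log line.
struct Row {
    c_ip: String,
    sc_status: u16,
    sc_bytes: u64,
}

/// Column-oriented layout: one vector per field.
struct Columns {
    c_ip: Vec<String>,
    sc_status: Vec<u16>,
    sc_bytes: Vec<u64>,
}

/// Converts rows into columns by pushing each field into its own vector.
fn to_columns(rows: Vec<Row>) -> Columns {
    let mut cols = Columns {
        c_ip: Vec::new(),
        sc_status: Vec::new(),
        sc_bytes: Vec::new(),
    };
    for r in rows {
        cols.c_ip.push(r.c_ip);
        cols.sc_status.push(r.sc_status);
        cols.sc_bytes.push(r.sc_bytes);
    }
    cols
}

/// An aggregation over one field reads only that field's vector.
fn total_bytes(cols: &Columns) -> u64 {
    cols.sc_bytes.iter().sum()
}

fn main() {
    let rows = vec![
        Row { c_ip: "100.100.100.100".to_string(), sc_status: 200, sc_bytes: 1897 },
        Row { c_ip: "100.100.100.101".to_string(), sc_status: 304, sc_bytes: 330 },
    ];
    let cols = to_columns(rows);
    println!("total bytes: {}", total_bytes(&cols));
    println!("rows: {}, statuses: {:?}", cols.c_ip.len(), cols.sc_status);
}
```

<p>Values of the same type stored next to each other also tend to compress better, which ties back to the storage point above.</p>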
<p>Using the Rust Parquet lib might be a bit more complex for a simple use case like this one. Luckily we have <a href="https://pola.rs/">Polars</a>, an open-source library written in Rust (also has a Python interface) for data analysis and manipulation. It is well designed to be fast and efficient and has a <a href="https://docs.pola.rs/py-polars/html/reference/lazyframe/index.html">LazyFrame</a> for larger data sets.</p>
<h3 id="creating-a-workflow"><a href="#creating-a-workflow">Creating a workflow</a></h3>
<p>Reading the TSV file (actually reading and unzipping) is easy with Rust. One assumption I have is that the entire dataset fits in memory. In the future, I’d like to improve this by exploring lazy evaluation with iterators. For now, we can have a few hundred megabytes worth of TSV in memory. I believe it’s more important to understand when O(n) memory usage is acceptable rather than prematurely optimizing for stream-based processing.</p>
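<p>For reference, one possible shape of the lazy, iterator-based variant mentioned above is sketched here. This is not what the project currently does. The <code>stream_rows</code> name is hypothetical, and the sketch assumes the input is available as any <code>BufRead</code> (in practice, a buffered reader wrapped around a gzip decoder):</p>

```rust
use std::io::BufRead;

/// Lazily yields the tab-separated fields of each data line.
/// Comment lines (starting with '#') and empty lines are skipped.
/// Nothing is read until the returned iterator is consumed.
fn stream_rows<R: BufRead>(reader: R) -> impl Iterator<Item = Vec<String>> {
    reader
        .lines()
        // stop at the first line that fails to read (I/O or UTF-8 error)
        .map_while(Result::ok)
        .filter(|l| !l.is_empty() && !l.starts_with('#'))
        .map(|l| l.split('\t').map(str::to_string).collect())
}

fn main() {
    // a byte slice implements BufRead, which is handy for a small demo
    let data = "#Version: 1.0\n2024-01-23\t12:59:15\tLAX53-P1\n\n2024-01-23\t13:01:02\tSOF50-C1\n";
    for row in stream_rows(data.as_bytes()) {
        println!("{:?}", row);
    }
}
```

<p>With this shape only one line is held in memory at a time, at the cost of having to decide where each row goes as soon as it is produced.</p>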
<h4 id="get-a-list-of-files-with-a-glob"><a href="#get-a-list-of-files-with-a-glob">Get a list of files with a glob</a></h4>
<p>Rust offers two main ways (and possibly more that I’m unaware of) to work with files:</p>
<ul>
<li>glob::glob</li>
<li>std::fs</li>
</ul>
<p>For flexible patterns with wildcards, opt for glob. It iterates over matching files, letting you print the paths or collect them into a vector. If you need fine-grained control over individual entries and the matching rules are simple, consider std::fs. It lets you filter directory entries by criteria like file type or extension. Both approaches report errors and offer customization options. I opted for the glob crate.</p>
<p>The files have the date in their name (for example: <code>E2F28FYJOwT1P9.2024-01-23-11.84ac498a.gz</code>), so we can create a simple glob that represents a single day in the dataset.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">use</span> <span style="color: #ff7b72;">glob</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">&lbrace;</span><span style="color: #e6edf3;">glob</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">Paths</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">PatternError</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="2"><span style="color: #ff7b72;">fn</span> <span style="color: #d2a8ff;">get_files</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">date</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #ff7b72;">str</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">-&gt;</span> <span style="color: #ffa657;">Result</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Paths</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">PatternError</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="3">    <span style="color: #d2a8ff;">glob</span><span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">&amp;</span><span style="color: #79c0ff;">format</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;gz/*.&lbrace;&rbrace;*.gz&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">date</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
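<p>For comparison, a rough equivalent using only <code>std::fs</code> might look like the sketch below. The <code>get_files_std</code> name and the explicit <code>dir</code> parameter are assumptions made for this example. It approximates the glob pattern by checking that the file name contains <code>.&lt;date&gt;</code> and ends with <code>.gz</code>:</p>

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Lists the regular files in `dir` whose names contain `.<date>` and end with `.gz`.
fn get_files_std(dir: &Path, date: &str) -> io::Result<Vec<PathBuf>> {
    let needle = format!(".{}", date);
    let mut files = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // file names that are not valid UTF-8 are treated as non-matching
        let matches = path
            .file_name()
            .and_then(|n| n.to_str())
            .map(|n| n.contains(&needle) && n.ends_with(".gz"))
            .unwrap_or(false);
        if matches && path.is_file() {
            files.push(path);
        }
    }
    // read_dir does not guarantee any ordering, so sort for stable output
    files.sort();
    Ok(files)
}

fn main() {
    match get_files_std(Path::new("gz"), "2024-01-23") {
        Ok(files) => {
            for f in files {
                println!("{}", f.display());
            }
        }
        Err(e) => eprintln!("could not list gz/: {}", e),
    }
}
```

<p>This is noticeably more code than the glob version, which is part of why the glob crate is a comfortable choice here.</p>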
<p>The first approach is to simply iterate over the glob results, log and skip any errors for now, and collect the file contents in an accumulator (hence making this O(n) memory).</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">path</span> <span style="color: #ff7b72;">in</span> <span style="color: #e6edf3;">paths</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="2">    <span style="color: #ff7b72;">if</span> <span style="color: #e6edf3;">path</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">is_err</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="3">        <span style="color: #79c0ff;">error</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;Could not process path with error: &lbrace;:?&rbrace;&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">path</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="4">        <span style="color: #ff7b72;">continue</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="5">    <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="6">    <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">path</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">PathBuf</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">path</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">unwrap</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="7">    <span style="color: #79c0ff;">info</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;Processing path: &lbrace;:?&rbrace;&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">path</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="8">    <span style="color: #8b949e;">// panics if the path is not valid UTF-8; unzip errors are handled in the match below</span>
</div><div class="line" data-line="9">    <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">tsv_maybe</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Result</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">String</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">Error</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #d2a8ff;">uncompress_gzip</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">path</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">to_str</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">unwrap</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="10">    <span style="color: #8b949e;">// the unzipped content is a potentially multiline TSV (tab separated values)</span>
</div><div class="line" data-line="11">    <span style="color: #ff7b72;">match</span> <span style="color: #e6edf3;">tsv_maybe</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="12">        <span style="color: #79c0ff;">Ok</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">tsv</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">=&gt;</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="13">            <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">entries</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">String</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #d2a8ff;">split_tsv</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">tsv</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="14">            <span style="color: #e6edf3;">acc</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">extend</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">entries</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="15">        <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="16">        <span style="color: #79c0ff;">Err</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">e</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">=&gt;</span> <span style="color: #79c0ff;">error</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;&lbrace;:?&rbrace;&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">e</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="17">    <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="18"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
<h4 id="unzip-and-parse"><a href="#unzip-and-parse">Unzip and parse</a></h4>
<p>The next phase is unzipping the files in memory and creating a <code>Vec&lt;String&gt;</code> for each line. Assembling the lines into a <code>Vec&lt;Vec&lt;String&gt;&gt;</code> is simple, and this is exactly what the <code>split_tsv</code> function does:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">fn</span> <span style="color: #d2a8ff;">split_tsv</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">tsv</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">String</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">-&gt;</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">String</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="2">    <span style="color: #8b949e;">//This is a bit unreadable</span>
</div><div class="line" data-line="3">    <span style="color: #8b949e;">// - splitting the TSV to lines</span>
</div><div class="line" data-line="4">    <span style="color: #8b949e;">// - removing lines starting with # or being empty</span>
</div><div class="line" data-line="5">    <span style="color: #8b949e;">// - split the lines by tab</span>
</div><div class="line" data-line="6">    <span style="color: #8b949e;">// - convert &amp;str to String so it can be returned (maybe this is not necessary?)</span>
</div><div class="line" data-line="7">
</div><div class="line" data-line="8">    <span style="color: #e6edf3;">tsv</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">split</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&#39;\n&#39;</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="9">        <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">filter</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">l</span><span style="color: #e6edf3;">|</span> <span style="color: #79c0ff;">!</span><span style="color: #e6edf3;">l</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">starts_with</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&#39;#&#39;</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="10">        <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">filter</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">l</span><span style="color: #e6edf3;">|</span> <span style="color: #79c0ff;">!</span><span style="color: #e6edf3;">l</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">is_empty</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="11">        <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">l</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="12">            <span style="color: #e6edf3;">l</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">split</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&#39;\t&#39;</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="13">                <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">s</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">s</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">to_string</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="14">                <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">String</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="15">        <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="16">        <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="17"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
<p>This function produces the following:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">[</span>
</div><div class="line" data-line="2">  <span style="color: #e6edf3;">[</span><span style="color: #a5d6ff;">&quot;2024-01-23&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;12:14:37&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;LAX53-P1&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">..</span><span style="color: #e6edf3;">.</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="3">  <span style="color: #e6edf3;">[</span><span style="color: #a5d6ff;">&quot;2024-01-23&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;12:56:19&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;SOF50-C1&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">..</span><span style="color: #e6edf3;">.</span> <span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;"></span>
</div></code></pre>
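<p>On the question raised in the comment of <code>split_tsv</code> (whether converting <code>&amp;str</code> to <code>String</code> is necessary): a borrowing variant is possible as long as the input string outlives the result. The sketch below is an illustration, not the project's code, and <code>split_tsv_borrowed</code> is a hypothetical name:</p>

```rust
/// Same filtering as `split_tsv`, but the fields borrow from the input
/// instead of being copied into new `String`s.
fn split_tsv_borrowed(tsv: &str) -> Vec<Vec<&str>> {
    // note: lines() also strips a trailing '\r', unlike split('\n')
    tsv.lines()
        .filter(|l| !l.is_empty() && !l.starts_with('#'))
        .map(|l| l.split('\t').collect())
        .collect()
}

fn main() {
    let tsv = "#Version: 1.0\n#Fields: date time x-edge-location\n\
               2024-01-23\t12:14:37\tLAX53-P1\n\
               2024-01-23\t12:56:19\tSOF50-C1\n";
    let rows = split_tsv_borrowed(tsv);
    println!("{:?}", rows);
}
```

<p>The trade-off is lifetimes. In the accumulator loop each file's content is dropped at the end of its iteration, so borrowed fields could not be stored in <code>acc</code> without keeping every decompressed file alive. Owned <code>String</code>s avoid that, which makes the extra allocation a reasonable price here.</p>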
<p>After having a 2D vector like the one above, we are almost ready to write the dataset to its permanent storage, a local folder in this case. There is a small problem with this dataset, though. The API that Polars has for creating a dataframe does not match the vector we got from processing TSVs. Polars expects a dataset that looks like this:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">s1</span> <span style="color: #79c0ff;">=</span> <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #d2a8ff;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;Fruit&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">[</span><span style="color: #a5d6ff;">&quot;Apple&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;Apple&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;Pear&quot;</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="2"><span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">s2</span> <span style="color: #79c0ff;">=</span> <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #d2a8ff;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;Color&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">[</span><span style="color: #a5d6ff;">&quot;Red&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;Yellow&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;Green&quot;</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="3">
</div><div class="line" data-line="4"><span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">df</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">PolarsResult</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">DataFrame</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #ffa657;">DataFrame</span><span style="color: #e6edf3;">::</span><span style="color: #d2a8ff;">new</span><span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">vec</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">s1</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">s2</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div></code></pre>
<p>As you can see, the new dataframe is built from a vector of Series. Each Series contains the same kind of data, for example fruit names or colors. We can simply transpose the 2D vector we have to produce something similar.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">fn</span> <span style="color: #d2a8ff;">transpose</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">T</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">T</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">-&gt;</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">T</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="2">    <span style="color: #8b949e;">// Transposing a 2D matrix</span>
</div><div class="line" data-line="3">    <span style="color: #8b949e;">// https://stackoverflow.com/questions/64498617/how-to-transpose-a-vector-of-vectors-in-rust</span>
</div><div class="line" data-line="4">
</div><div class="line" data-line="5">    <span style="color: #79c0ff;">assert</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">!</span><span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">.</span><span style="color: #e6edf3;">is_empty</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="6">    <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">len</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">0</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">len</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="7">    <span style="color: #ff7b72;">let</span> <span style="color: #ff7b72;">mut</span> <span style="color: #e6edf3;">iters</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">_</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">into_iter</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">n</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">n</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">into_iter</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="8">    <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">0</span><span style="color: #79c0ff;">..</span><span style="color: #e6edf3;">len</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="9">        <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">_</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="10">            <span style="color: #e6edf3;">iters</span>
</div><div class="line" data-line="11">                <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">iter_mut</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="12">                <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">n</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">n</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">next</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">unwrap</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="13">                <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">T</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="14">        <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="15">        <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="16"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
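<p>To make the reshaping concrete, here is a minimal sketch that runs the same function on a tiny row-oriented input. The fruit and color values are made up for illustration. The function assumes every row has the same length as the first one; a shorter row would make <code>unwrap</code> panic.</p>

```rust
// Same iterator-based transpose as above, repeated so the sketch is self-contained.
fn transpose<T>(v: Vec<Vec<T>>) -> Vec<Vec<T>> {
    assert!(!v.is_empty());
    let len = v[0].len();
    // One consuming iterator per input row.
    let mut iters: Vec<_> = v.into_iter().map(|n| n.into_iter()).collect();
    (0..len)
        .map(|_| {
            // Take the next element of every row to build one output column.
            iters
                .iter_mut()
                .map(|n| n.next().unwrap())
                .collect::<Vec<T>>()
        })
        .collect()
}

fn main() {
    // Hypothetical row-oriented records: one inner vector per record.
    let rows = vec![
        vec!["apple", "red"],
        vec!["banana", "yellow"],
        vec!["pear", "green"],
    ];

    let columns = transpose(rows);

    // After transposing there is one inner vector per column.
    assert_eq!(columns[0], vec!["apple", "banana", "pear"]);
    assert_eq!(columns[1], vec!["red", "yellow", "green"]);
    println!("{:?}", columns);
}
```

<p>Each inner vector of the result can then be turned into a Series, which is the shape the DataFrame constructor shown earlier expects.</p>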
<p>I could not tell which Stack Overflow answer was better just by reading the source code, so I decided to measure whether there is a significant performance difference between them.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">use</span> <span style="color: #ff7b72;">criterion</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">&lbrace;</span><span style="color: #e6edf3;">black_box</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">criterion_group</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">criterion_main</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">Criterion</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="2">
</div><div class="line" data-line="3"><span style="color: #ff7b72;">fn</span> <span style="color: #d2a8ff;">transpose1</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">T</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">original</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">T</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">-&gt;</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">T</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="4">    <span style="color: #79c0ff;">assert</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">!</span><span style="color: #e6edf3;">original</span><span style="color: #e6edf3;">.</span><span style="color: #e6edf3;">is_empty</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="5">    <span style="color: #ff7b72;">let</span> <span style="color: #ff7b72;">mut</span> <span style="color: #e6edf3;">transposed</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">0</span><span style="color: #79c0ff;">..</span><span style="color: #e6edf3;">original</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">0</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">len</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">_</span><span style="color: #e6edf3;">|</span> <span style="color: #79c0ff;">vec</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">_</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="6">
</div><div class="line" data-line="7">    <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">original_row</span> <span style="color: #ff7b72;">in</span> <span style="color: #e6edf3;">original</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="8">        <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">item</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">transposed_row</span><span style="color: #e6edf3;">)</span> <span style="color: #ff7b72;">in</span> <span style="color: #e6edf3;">original_row</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">into_iter</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">zip</span><span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">&amp;</span><span style="color: #ff7b72;">mut</span> <span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="9">            <span style="color: #e6edf3;">transposed_row</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">push</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">item</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="10">        <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="11">    <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="12">
</div><div class="line" data-line="13">    <span style="color: #e6edf3;">transposed</span>
</div><div class="line" data-line="14"><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="15">
</div><div class="line" data-line="16"><span style="color: #ff7b72;">fn</span> <span style="color: #d2a8ff;">transpose2</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">T</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">T</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">-&gt;</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">T</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="17">    <span style="color: #79c0ff;">assert</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">!</span><span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">.</span><span style="color: #e6edf3;">is_empty</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="18">    <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">len</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">0</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">len</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="19">    <span style="color: #ff7b72;">let</span> <span style="color: #ff7b72;">mut</span> <span style="color: #e6edf3;">iters</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">_</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">into_iter</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">n</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">n</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">into_iter</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="20">    <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">0</span><span style="color: #79c0ff;">..</span><span style="color: #e6edf3;">len</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="21">        <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">_</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="22">            <span style="color: #e6edf3;">iters</span>
</div><div class="line" data-line="23">                <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">iter_mut</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="24">                <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">n</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">n</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">next</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">unwrap</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="25">                <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">T</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="26">        <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="27">        <span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="28"><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="29">
</div><div class="line" data-line="30"><span style="color: #ff7b72;">fn</span> <span style="color: #d2a8ff;">criterion_benchmark</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">crit</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #ff7b72;">mut</span> <span style="color: #ffa657;">Criterion</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="31">    <span style="color: #e6edf3;">crit</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">bench_function</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;transpose1&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">b</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="32">        <span style="color: #e6edf3;">b</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">iter</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="33">            <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">a</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ff7b72;">u64</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">0</span><span style="color: #79c0ff;">..</span><span style="color: #79c0ff;">1000</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">v</span> <span style="color: #79c0ff;">+</span> <span style="color: #79c0ff;">1000</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="34">            <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">b</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ff7b72;">u64</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">0</span><span style="color: #79c0ff;">..</span><span style="color: #79c0ff;">1000</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">v</span> <span style="color: #79c0ff;">+</span> <span style="color: #79c0ff;">1000</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="35">            <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">c</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ff7b72;">u64</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">0</span><span style="color: #79c0ff;">..</span><span style="color: #79c0ff;">1000</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">v</span> <span style="color: #79c0ff;">+</span> <span style="color: #79c0ff;">1000</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="36">
</div><div class="line" data-line="37">            <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">vx</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ff7b72;">u64</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #79c0ff;">vec</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">a</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">b</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">c</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="38">
</div><div class="line" data-line="39">            <span style="color: #d2a8ff;">transpose1</span><span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">black_box</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">vx</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="40">        <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="41">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="42">
</div><div class="line" data-line="43">    <span style="color: #e6edf3;">crit</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">bench_function</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;transpose2&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">b</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="44">        <span style="color: #e6edf3;">b</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">iter</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="45">            <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">a</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ff7b72;">u64</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">0</span><span style="color: #79c0ff;">..</span><span style="color: #79c0ff;">1000</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">v</span> <span style="color: #79c0ff;">+</span> <span style="color: #79c0ff;">1000</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="46">            <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">b</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ff7b72;">u64</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">0</span><span style="color: #79c0ff;">..</span><span style="color: #79c0ff;">1000</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">v</span> <span style="color: #79c0ff;">+</span> <span style="color: #79c0ff;">1000</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="47">            <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">c</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ff7b72;">u64</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">0</span><span style="color: #79c0ff;">..</span><span style="color: #79c0ff;">1000</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">v</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">v</span> <span style="color: #79c0ff;">+</span> <span style="color: #79c0ff;">1000</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="48">
</div><div class="line" data-line="49">            <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">vx</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ff7b72;">u64</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #79c0ff;">vec</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">a</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">b</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">c</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="50">
</div><div class="line" data-line="51">            <span style="color: #d2a8ff;">transpose2</span><span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">black_box</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">vx</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="52">        <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="53">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="54"><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="55">
</div><div class="line" data-line="56"><span style="color: #79c0ff;">criterion_group</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">benches</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">criterion_benchmark</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="57"><span style="color: #79c0ff;">criterion_main</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">benches</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div></code></pre>
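<p>Before comparing timings, it is worth confirming that both candidates return the same result. The following is a small sketch of such a check. It is not part of the original benchmark, and the 2x3 input is arbitrary.</p>

```rust
// Push-based variant: pre-create the output rows, then push items into them.
fn transpose1<T>(original: Vec<Vec<T>>) -> Vec<Vec<T>> {
    assert!(!original.is_empty());
    let mut transposed = (0..original[0].len()).map(|_| vec![]).collect::<Vec<_>>();

    for original_row in original {
        for (item, transposed_row) in original_row.into_iter().zip(&mut transposed) {
            transposed_row.push(item);
        }
    }

    transposed
}

// Iterator-based variant: advance one iterator per input row in lockstep.
fn transpose2<T>(v: Vec<Vec<T>>) -> Vec<Vec<T>> {
    assert!(!v.is_empty());
    let len = v[0].len();
    let mut iters: Vec<_> = v.into_iter().map(|n| n.into_iter()).collect();
    (0..len)
        .map(|_| {
            iters
                .iter_mut()
                .map(|n| n.next().unwrap())
                .collect::<Vec<T>>()
        })
        .collect()
}

fn main() {
    // Arbitrary rectangular input: 2 rows, 3 columns.
    let input: Vec<Vec<u64>> = vec![vec![1, 2, 3], vec![4, 5, 6]];

    let t1 = transpose1(input.clone());
    let t2 = transpose2(input);

    // Both variants should produce 3 rows of 2 items each.
    assert_eq!(t1, vec![vec![1, 4], vec![2, 5], vec![3, 6]]);
    assert_eq!(t1, t2);
    println!("both variants agree: {:?}", t1);
}
```

<p>The two variants only agree on rectangular input. If a later row is shorter than the first, <code>transpose1</code> quietly returns a ragged result because <code>zip</code> stops at the shorter side, while <code>transpose2</code> panics on <code>unwrap</code>.</p>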
<p>Running the benchmark with cargo bench is easy, and it produces an HTML report that is straightforward to read. The two functions perform almost identically, so I picked the slightly faster one.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1">❯ <span style="color: #e6edf3;">cargo</span> <span style="color: #e6edf3;">bench</span>
</div><div class="line" data-line="2">    <span style="color: #ffa657;">Finished</span> <span style="color: #e6edf3;">bench</span> <span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">optimized</span><span style="color: #e6edf3;">]</span> <span style="color: #d2a8ff;">target</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">s</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">in</span> <span style="color: #79c0ff;">0.13</span><span style="color: #e6edf3;">s</span>
</div><div class="line" data-line="3">     <span style="color: #ffa657;">Running</span> <span style="color: #e6edf3;">unittests</span> <span style="color: #e6edf3;">src</span><span style="color: #79c0ff;">/</span><span style="color: #e6edf3;">main</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">rs</span> <span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">target</span><span style="color: #79c0ff;">/</span><span style="color: #e6edf3;">release</span><span style="color: #79c0ff;">/</span><span style="color: #e6edf3;">deps</span><span style="color: #79c0ff;">/</span><span style="color: #e6edf3;">cf_logs</span><span style="color: #79c0ff;">-</span><span style="color: #79c0ff;">5058</span><span style="color: #e6edf3;">ea9e2adfad2d</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="4">
</div><div class="line" data-line="5"><span style="color: #e6edf3;">running</span> <span style="color: #79c0ff;">1</span> <span style="color: #e6edf3;">test</span>
</div><div class="line" data-line="6"><span style="color: #e6edf3;">test</span> <span style="color: #ff7b72;">tests</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">exploration_df</span> <span style="color: #79c0ff;">...</span> <span style="color: #e6edf3;">ignored</span>
</div><div class="line" data-line="7">
</div><div class="line" data-line="8"><span style="color: #e6edf3;">test</span> <span style="color: #e6edf3;">result</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">ok</span><span style="color: #e6edf3;">.</span> <span style="color: #79c0ff;">0</span> <span style="color: #79c0ff;">passed</span><span style="color: #e6edf3;">;</span> <span style="color: #79c0ff;">0</span> <span style="color: #e6edf3;">failed</span><span style="color: #e6edf3;">;</span> <span style="color: #79c0ff;">1</span> <span style="color: #e6edf3;">ignored</span><span style="color: #e6edf3;">;</span> <span style="color: #79c0ff;">0</span> <span style="color: #e6edf3;">measured</span><span style="color: #e6edf3;">;</span> <span style="color: #79c0ff;">0</span> <span style="color: #e6edf3;">filtered</span> <span style="color: #e6edf3;">out</span><span style="color: #e6edf3;">;</span> <span style="color: #e6edf3;">finished</span> <span style="color: #e6edf3;">in</span> <span style="color: #79c0ff;">0</span><span style="color: #e6edf3;">.</span><span style="color: #79c0ff;">00</span><span style="color: #e6edf3;">s</span>
</div><div class="line" data-line="9">
</div><div class="line" data-line="10">     <span style="color: #ffa657;">Running</span> <span style="color: #79c0ff;">benches</span><span style="color: #79c0ff;">/</span><span style="color: #e6edf3;">bench</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">rs</span> <span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">target</span><span style="color: #79c0ff;">/</span><span style="color: #e6edf3;">release</span><span style="color: #79c0ff;">/</span><span style="color: #e6edf3;">deps</span><span style="color: #79c0ff;">/</span><span style="color: #e6edf3;">bench</span><span style="color: #79c0ff;">-</span><span style="color: #79c0ff;">199f64</span><span style="color: #e6edf3;">d850bf6279</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="11"><span style="color: #ffa657;">Gnuplot</span> <span style="color: #e6edf3;">not</span> <span style="color: #e6edf3;">found</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">using</span> <span style="color: #e6edf3;">plotters</span> <span style="color: #e6edf3;">backend</span>
</div><div class="line" data-line="12"><span style="color: #e6edf3;">transpose1</span>              <span style="color: #e6edf3;">time</span><span style="color: #e6edf3;">:</span>   <span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">34.802</span> <span style="color: #e6edf3;">µs</span> <span style="color: #79c0ff;">34</span><span style="color: #e6edf3;">.</span><span style="color: #79c0ff;">944</span> <span style="color: #e6edf3;">µs</span> <span style="color: #79c0ff;">35</span><span style="color: #e6edf3;">.</span><span style="color: #79c0ff;">087</span> <span style="color: #e6edf3;">µs</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="13">                        <span style="color: #e6edf3;">change</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">-</span><span style="color: #79c0ff;">0.2763</span><span style="color: #79c0ff;">%</span> <span style="color: #79c0ff;">+</span><span style="color: #79c0ff;">0.0465</span><span style="color: #79c0ff;">%</span> <span style="color: #79c0ff;">+</span><span style="color: #79c0ff;">0.3828</span><span style="color: #79c0ff;">%</span><span style="color: #e6edf3;">]</span> <span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">p</span> <span style="color: #79c0ff;">=</span> <span style="color: #79c0ff;">0.79</span> <span style="color: #79c0ff;">&gt;</span> <span style="color: #79c0ff;">0.05</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="14">                        <span style="color: #ffa657;">No</span> <span style="color: #e6edf3;">change</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">performance</span> <span style="color: #e6edf3;">detected</span><span style="color: #e6edf3;">.</span>
</div><div class="line" data-line="15"><span style="color: #ffa657;">Found</span> <span style="color: #79c0ff;">6</span> <span style="color: #e6edf3;">outliers</span> <span style="color: #e6edf3;">among</span> <span style="color: #79c0ff;">100</span> <span style="color: #79c0ff;">measurements</span> <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">6.00</span><span style="color: #79c0ff;">%</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="16">  <span style="color: #79c0ff;">5</span> <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">5.00</span><span style="color: #79c0ff;">%</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">high</span> <span style="color: #e6edf3;">mild</span>
</div><div class="line" data-line="17">  <span style="color: #79c0ff;">1</span> <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">1.00</span><span style="color: #79c0ff;">%</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">high</span> <span style="color: #e6edf3;">severe</span>
</div><div class="line" data-line="18">
</div><div class="line" data-line="19"><span style="color: #e6edf3;">transpose2</span>              <span style="color: #e6edf3;">time</span><span style="color: #e6edf3;">:</span>   <span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">27.966</span> <span style="color: #e6edf3;">µs</span> <span style="color: #79c0ff;">28</span><span style="color: #e6edf3;">.</span><span style="color: #79c0ff;">068</span> <span style="color: #e6edf3;">µs</span> <span style="color: #79c0ff;">28</span><span style="color: #e6edf3;">.</span><span style="color: #79c0ff;">174</span> <span style="color: #e6edf3;">µs</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="20">                        <span style="color: #e6edf3;">change</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">-</span><span style="color: #79c0ff;">0.0055</span><span style="color: #79c0ff;">%</span> <span style="color: #79c0ff;">+</span><span style="color: #79c0ff;">0.3648</span><span style="color: #79c0ff;">%</span> <span style="color: #79c0ff;">+</span><span style="color: #79c0ff;">0.6945</span><span style="color: #79c0ff;">%</span><span style="color: #e6edf3;">]</span> <span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">p</span> <span style="color: #79c0ff;">=</span> <span style="color: #79c0ff;">0.04</span> <span style="color: #79c0ff;">&lt;</span> <span style="color: #79c0ff;">0.05</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="21">                        <span style="color: #ffa657;">Change</span> <span style="color: #e6edf3;">within</span> <span style="color: #e6edf3;">noise</span> <span style="color: #e6edf3;">threshold</span><span style="color: #e6edf3;">.</span>
</div><div class="line" data-line="22"><span style="color: #ffa657;">Found</span> <span style="color: #79c0ff;">1</span> <span style="color: #e6edf3;">outliers</span> <span style="color: #e6edf3;">among</span> <span style="color: #79c0ff;">100</span> <span style="color: #79c0ff;">measurements</span> <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">1.00</span><span style="color: #79c0ff;">%</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="23">  <span style="color: #79c0ff;">1</span> <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">1.00</span><span style="color: #79c0ff;">%</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">high</span> <span style="color: #e6edf3;">mild</span>
</div></code></pre>
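<p>With a median of about 28 µs against roughly 35 µs for transpose1, transpose2 is around 20% faster in this benchmark, so I opted for the transpose2 function.</p>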
<p>With that out of the way, it is time to create the function that produces the data we can feed to Polars.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">use</span> <span style="color: #ff7b72;">polars</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">&lbrace;</span><span style="color: #ff7b72;">prelude</span><span style="color: #e6edf3;">::</span><span style="color: #ffa657;">NamedFrom</span><span style="color: #e6edf3;">,</span> <span style="color: #ff7b72;">series</span><span style="color: #e6edf3;">::</span><span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="2">
</div><div class="line" data-line="3"><span style="color: #ff7b72;">pub</span><span style="color: #e6edf3;">(</span><span style="color: #ff7b72;">crate</span><span style="color: #e6edf3;">)</span> <span style="color: #ff7b72;">fn</span> <span style="color: #d2a8ff;">get_columns</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">String</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">-&gt;</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="4">    <span style="color: #8b949e;">// this creates a vec of Series from a 2D vec</span>
</div><div class="line" data-line="5">
</div><div class="line" data-line="6">    <span style="color: #79c0ff;">vec</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">[</span>
</div><div class="line" data-line="7">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;date&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">0</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>                <span style="color: #8b949e;">// date</span>
</div><div class="line" data-line="8">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;time&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">1</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>                <span style="color: #8b949e;">// time</span>
</div><div class="line" data-line="9">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;x-edge-location&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">2</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>     <span style="color: #8b949e;">// x-edge-location</span>
</div><div class="line" data-line="10">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;sc-bytes&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">3</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>            <span style="color: #8b949e;">// sc-bytes</span>
</div><div class="line" data-line="11">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;c-ip&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">4</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>                <span style="color: #8b949e;">// c-ip</span>
</div><div class="line" data-line="12">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;cs-method&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">5</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>           <span style="color: #8b949e;">// cs-method</span>
</div><div class="line" data-line="13">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;cs(Host)&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">6</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>            <span style="color: #8b949e;">// cs(Host)</span>
</div><div class="line" data-line="14">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;cs-uri-stem&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">7</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>         <span style="color: #8b949e;">// cs-uri-stem</span>
</div><div class="line" data-line="15">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;sc-status&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">8</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>           <span style="color: #8b949e;">// sc-status</span>
</div><div class="line" data-line="16">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;cs(Referer)&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">9</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>         <span style="color: #8b949e;">// cs(Referer)</span>
</div><div class="line" data-line="17">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;cs(User-Agent)&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">10</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>     <span style="color: #8b949e;">// cs(User-Agent)</span>
</div><div class="line" data-line="18">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;cs-uri-query&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">11</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>       <span style="color: #8b949e;">// cs-uri-query</span>
</div><div class="line" data-line="19">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;cs(Cookie)&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">12</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>         <span style="color: #8b949e;">// cs(Cookie)</span>
</div><div class="line" data-line="20">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;x-edge-result-type&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">13</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span> <span style="color: #8b949e;">// x-edge-result-type</span>
</div><div class="line" data-line="21">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;x-edge-request-id&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">14</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>  <span style="color: #8b949e;">// x-edge-request-id</span>
</div><div class="line" data-line="22">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;x-host-header&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">15</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>      <span style="color: #8b949e;">// x-host-header</span>
</div><div class="line" data-line="23">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;cs-protocol&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">16</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>        <span style="color: #8b949e;">// cs-protocol</span>
</div><div class="line" data-line="24">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;cs-bytes&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">17</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>           <span style="color: #8b949e;">// cs-bytes</span>
</div><div class="line" data-line="25">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;time-taken&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">18</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>         <span style="color: #8b949e;">// time-taken</span>
</div><div class="line" data-line="26">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;x-forwarded-for&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">19</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>    <span style="color: #8b949e;">// x-forwarded-for</span>
</div><div class="line" data-line="27">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;ssl-protocol&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">20</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>       <span style="color: #8b949e;">// ssl-protocol</span>
</div><div class="line" data-line="28">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;ssl-cipher&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">21</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>         <span style="color: #8b949e;">// ssl-cipher</span>
</div><div class="line" data-line="29">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;x-edge-response-result-type&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">22</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span> <span style="color: #8b949e;">// x-edge-response-result-type</span>
</div><div class="line" data-line="30">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;cs-protocol-version&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">23</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span> <span style="color: #8b949e;">// cs-protocol-version</span>
</div><div class="line" data-line="31">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;fle-status&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">24</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>         <span style="color: #8b949e;">// fle-status</span>
</div><div class="line" data-line="32">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;fle-encrypted-fields&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">25</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span> <span style="color: #8b949e;">// fle-encrypted-fields</span>
</div><div class="line" data-line="33">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;c-port&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">26</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>             <span style="color: #8b949e;">// c-port</span>
</div><div class="line" data-line="34">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;time-to-first-byte&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">27</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span> <span style="color: #8b949e;">// time-to-first-byte</span>
</div><div class="line" data-line="35">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;x-edge-detailed-result-type&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">28</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span> <span style="color: #8b949e;">// x-edge-detailed-result-type</span>
</div><div class="line" data-line="36">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;sc-content-type&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">29</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>    <span style="color: #8b949e;">// sc-content-type</span>
</div><div class="line" data-line="37">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;sc-content-len&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">30</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>     <span style="color: #8b949e;">// sc-content-len</span>
</div><div class="line" data-line="38">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;sc-range-start&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">31</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>     <span style="color: #8b949e;">// sc-range-start</span>
</div><div class="line" data-line="39">        <span style="color: #ffa657;">Series</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">new</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;sc-range-end&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #79c0ff;">&amp;</span><span style="color: #e6edf3;">transposed</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">32</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span>       <span style="color: #8b949e;">// sc-range-end</span>
</div><div class="line" data-line="40">    <span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="41"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
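<p>Because <code>get_columns</code> indexes <code>transposed[0]</code> through <code>transposed[32]</code> directly, an input that yields fewer than 33 columns would panic at that point. Below is a minimal sketch of a guard that uses only the standard library. The <code>CF_FIELDS</code> constant and the <code>name_columns</code> helper are hypothetical names introduced for illustration, not part of the original project; the field list is copied from the function above.</p>

```rust
/// Field names in the same order as the `Series::new` calls in `get_columns`.
/// (Hypothetical constant, introduced only for this sketch.)
const CF_FIELDS: [&str; 33] = [
    "date", "time", "x-edge-location", "sc-bytes", "c-ip", "cs-method",
    "cs(Host)", "cs-uri-stem", "sc-status", "cs(Referer)", "cs(User-Agent)",
    "cs-uri-query", "cs(Cookie)", "x-edge-result-type", "x-edge-request-id",
    "x-host-header", "cs-protocol", "cs-bytes", "time-taken", "x-forwarded-for",
    "ssl-protocol", "ssl-cipher", "x-edge-response-result-type",
    "cs-protocol-version", "fle-status", "fle-encrypted-fields", "c-port",
    "time-to-first-byte", "x-edge-detailed-result-type", "sc-content-type",
    "sc-content-len", "sc-range-start", "sc-range-end",
];

/// Pairs every transposed column with its field name.
/// Returns an error instead of panicking when the column count is not 33.
fn name_columns(
    transposed: Vec<Vec<String>>,
) -> Result<Vec<(&'static str, Vec<String>)>, String> {
    if transposed.len() != CF_FIELDS.len() {
        return Err(format!(
            "expected {} columns, got {}",
            CF_FIELDS.len(),
            transposed.len()
        ));
    }
    // zip consumes `transposed`, so the column data is moved, not cloned.
    Ok(CF_FIELDS.iter().copied().zip(transposed).collect())
}
```

<p>Each returned pair could then be passed to <code>Series::new(name, &amp;values)</code> in a loop rather than spelling out 33 calls by hand. Whether that is worth it is a matter of taste: the explicit list in the post is longer, but it is easy to read and to compare with the AWS field documentation.</p>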
<p>Only one step remains in the project: writing the Parquet files to disk.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">let</span> <span style="color: #ff7b72;">mut</span> <span style="color: #e6edf3;">df</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">DataFrame</span> <span style="color: #79c0ff;">=</span> <span style="color: #ffa657;">DataFrame</span><span style="color: #e6edf3;">::</span><span style="color: #d2a8ff;">new</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">columns</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">unwrap</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="2"><span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">file_name</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">String</span> <span style="color: #79c0ff;">=</span> <span style="color: #79c0ff;">format</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;par/&lbrace;&rbrace;.zstd.parq&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">date</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="3"><span style="color: #ff7b72;">let</span> <span style="color: #ff7b72;">mut</span> <span style="color: #e6edf3;">file</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">File</span> <span style="color: #79c0ff;">=</span> <span style="color: #ffa657;">File</span><span style="color: #e6edf3;">::</span><span style="color: #d2a8ff;">create</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">file_name</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">unwrap</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="4"><span style="color: #79c0ff;">Ok</span><span style="color: #e6edf3;">(</span><span style="color: #ffa657;">ParquetWriter</span><span style="color: #e6edf3;">::</span><span style="color: #d2a8ff;">new</span><span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">&amp;</span><span style="color: #ff7b72;">mut</span> <span style="color: #e6edf3;">file</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">finish</span><span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">&amp;</span><span style="color: #ff7b72;">mut</span> <span style="color: #e6edf3;">df</span><span style="color: #e6edf3;">)</span><span style="color: #79c0ff;">?</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;"></span>
</div></code></pre>
<p>This function writes the DataFrame to a Parquet file compressed with Zstandard.</p>
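<p>One thing the snippet above takes for granted is that the par/ directory already exists. File::create returns an error when the parent directory is missing, and unwrap() turns that error into a panic. A minimal sketch of a standard-library-only helper that creates the directory first and propagates I/O errors with the ? operator could look like the following. The name create_output_file and its parameters are my own illustration here, not part of the original project:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1">use std::fs::&lbrace;self, File&rbrace;;
</div><div class="line" data-line="2">use std::io;
</div><div class="line" data-line="3">use std::path::&lbrace;Path, PathBuf&rbrace;;
</div><div class="line" data-line="4">
</div><div class="line" data-line="5">// Hypothetical helper, not part of the original project: creates the output
</div><div class="line" data-line="6">// directory if needed and returns the new file together with its path.
</div><div class="line" data-line="7">fn create_output_file(dir: &amp;Path, date: &amp;str) -&gt; io::Result&lt;(PathBuf, File)&gt; &lbrace;
</div><div class="line" data-line="8">    // File::create returns an error when the parent directory is missing.
</div><div class="line" data-line="9">    fs::create_dir_all(dir)?;
</div><div class="line" data-line="10">    let path = dir.join(format!(&quot;&lbrace;&rbrace;.zstd.parq&quot;, date));
</div><div class="line" data-line="11">    let file = File::create(&amp;path)?;
</div><div class="line" data-line="12">    Ok((path, file))
</div><div class="line" data-line="13">&rbrace;
</div></code></pre>
<p>The returned file can then be passed to ParquetWriter::new(&amp;mut file) exactly as before, provided the binding is declared as mut.</p>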
<h2 id="closing"><a href="#closing">Closing</a></h2>
<p>Rust is a powerful language for data engineering with modern features and great libraries. By combining Rust and Polars (and the libraries it builds on), we can efficiently process and analyze datasets like AWS CloudFront logs. Give Rust a try for your next data engineering project; you won’t be disappointed!</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Fri, 07 Feb 2025 13:45:01 +0100</pubDate>
    </item>
    <item>
      <title>Running .NET 8 web applications in Docker</title>
      <link>https://dev.l1x.be/posts/2024/02/17/running-.net-8-web-applications-in-docker/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2024/02/17/running-.net-8-web-applications-in-docker/</guid>
      <content:encoded><![CDATA[<p><img src="/static/img/og/airplane_container.webp" alt="Airplane Container" /></p>
<p>There are several compelling reasons to run a .NET 8 web application in Docker:</p>
<p>Portability and Consistency: Docker creates standardized environments, ensuring your application runs the same way regardless of the underlying operating system. This simplifies deployment across different platforms, cloud providers, and development machines. It is also easy to target both of today's mainstream CPU architectures (ARM and x86).</p>
<p>Isolation and Resource Management: Each container runs in a self-contained space, preventing conflicts with other applications and ensuring each receives the necessary resources. This improves reliability and security.</p>
<p>Simplified Development and DevOps: Docker streamlines the development workflow by providing a consistent environment for developers and testers. Additionally, containerized .NET applications integrate seamlessly with DevOps tools and CI/CD pipelines, automating deployment and management.</p>
<h2 id="creating-a-simple-web-app"><a href="#creating-a-simple-web-app">Creating a simple web app</a></h2>
<p>I use Linux for this because we are in the process of migrating from Windows to Linux at work, and I wanted to see how hard it is to set up the development environment.</p>
<ul>
<li>
<p>The first step is to add the Microsoft package repository that provides the dotnet SDK. I use Debian for most Linux installations, and the following works on Bookworm (x86). The only unusual thing about my setup is that I use doas instead of sudo.</p>
</li>
</ul>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">wget</span> <span style="color: #e6edf3;">https://packages.microsoft.com/config/debian/12/packages-microsoft-prod.deb</span> <span style="color: #e6edf3;">-O</span> <span style="color: #e6edf3;">packages-microsoft-prod.deb</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">doas</span> <span style="color: #e6edf3;">apt</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">./packages-microsoft-prod.deb</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">rm</span> <span style="color: #e6edf3;">packages-microsoft-prod.deb</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">doas</span> <span style="color: #e6edf3;">apt</span> <span style="color: #e6edf3;">update</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">doas</span> <span style="color: #e6edf3;">apt</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">dotnet-sdk-8.0</span> <span style="color: #e6edf3;">-y</span>
</div></code></pre>
<p>The dotnet version can be verified the following way:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">l1x</span><span style="color: #d2a8ff;">@deby</span> <span style="color: #e6edf3;">~</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">dotnet</span> <span style="color: #e6edf3;">--version</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">8.0.200</span>
</div></code></pre>
<p>Great, we have the dotnet version we need.</p>
<ul>
<li>Creating a web application</li>
</ul>
<p>I opt out of the Microsoft telemetry collection and then look at how to create a new project skeleton for a simple API.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">l1x</span><span style="color: #d2a8ff;">@deby</span> <span style="color: #e6edf3;">~</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">export</span> <span style="color: #e6edf3;">DOTNET_CLI_TELEMETRY_OPTOUT=1</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">l1x</span><span style="color: #d2a8ff;">@deby</span> <span style="color: #e6edf3;">~</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">dotnet</span> <span style="color: #e6edf3;">new</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">The</span> <span style="color: #a5d6ff;">&#39;dotnet new&#39;</span> <span style="color: #e6edf3;">command</span> <span style="color: #e6edf3;">creates</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">.NET</span> <span style="color: #e6edf3;">project</span> <span style="color: #e6edf3;">based</span> <span style="color: #e6edf3;">on</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">template.</span>
</div><div class="line" data-line="4">
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">Common</span> <span style="color: #e6edf3;">templates</span> <span style="color: #e6edf3;">are:</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">Template</span> <span style="color: #e6edf3;">Name</span>   <span style="color: #e6edf3;">Short</span> <span style="color: #e6edf3;">Name</span>  <span style="color: #e6edf3;">Language</span>    <span style="color: #e6edf3;">Tags</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">--------------</span>  <span style="color: #e6edf3;">----------</span>  <span style="color: #e6edf3;">----------</span>  <span style="color: #e6edf3;">----------------------</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">Blazor</span> <span style="color: #e6edf3;">Web</span> <span style="color: #e6edf3;">App</span>  <span style="color: #e6edf3;">blazor</span>      <span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">C#</span><span style="color: #e6edf3;">]</span>        <span style="color: #e6edf3;">Web/Blazor/WebAssembly</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">Class</span> <span style="color: #e6edf3;">Library</span>   <span style="color: #e6edf3;">classlib</span>    <span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">C#</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,F#,VB</span>  <span style="color: #e6edf3;">Common/Library</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">Console</span> <span style="color: #e6edf3;">App</span>     <span style="color: #e6edf3;">console</span>     <span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">C#</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,F#,VB</span>  <span style="color: #e6edf3;">Common/Console</span>
</div><div class="line" data-line="11">
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">An</span> <span style="color: #e6edf3;">example</span> <span style="color: #e6edf3;">would</span> <span style="color: #e6edf3;">be:</span>
</div><div class="line" data-line="13">   <span style="color: #d2a8ff;">dotnet</span> <span style="color: #e6edf3;">new</span> <span style="color: #e6edf3;">console</span>
</div><div class="line" data-line="14">
</div><div class="line" data-line="15"><span style="color: #d2a8ff;">Display</span> <span style="color: #e6edf3;">template</span> <span style="color: #e6edf3;">options</span> <span style="color: #e6edf3;">with:</span>
</div><div class="line" data-line="16">   <span style="color: #d2a8ff;">dotnet</span> <span style="color: #e6edf3;">new</span> <span style="color: #e6edf3;">console</span> <span style="color: #e6edf3;">-h</span>
</div><div class="line" data-line="17"><span style="color: #d2a8ff;">Display</span> <span style="color: #e6edf3;">all</span> <span style="color: #e6edf3;">installed</span> <span style="color: #e6edf3;">templates</span> <span style="color: #e6edf3;">with:</span>
</div><div class="line" data-line="18">   <span style="color: #d2a8ff;">dotnet</span> <span style="color: #e6edf3;">new</span> <span style="color: #e6edf3;">list</span>
</div><div class="line" data-line="19"><span style="color: #d2a8ff;">Display</span> <span style="color: #e6edf3;">templates</span> <span style="color: #e6edf3;">available</span> <span style="color: #e6edf3;">on</span> <span style="color: #e6edf3;">NuGet.org</span> <span style="color: #e6edf3;">with:</span>
</div><div class="line" data-line="20">   <span style="color: #d2a8ff;">dotnet</span> <span style="color: #e6edf3;">new</span> <span style="color: #e6edf3;">search</span> <span style="color: #e6edf3;">web</span>
</div></code></pre>
<p>As it turns out, creating a simple web API is really not that hard.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">l1x</span><span style="color: #d2a8ff;">@deby</span> <span style="color: #e6edf3;">~/c/dotnet-test</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">dotnet</span> <span style="color: #e6edf3;">new</span> <span style="color: #e6edf3;">webapi</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">The</span> <span style="color: #e6edf3;">template</span> <span style="color: #a5d6ff;">&quot;ASP.NET Core Web API&quot;</span> <span style="color: #e6edf3;">was</span> <span style="color: #e6edf3;">created</span> <span style="color: #e6edf3;">successfully.</span>
</div><div class="line" data-line="3">
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">Processing</span> <span style="color: #e6edf3;">post-creation</span> <span style="color: #e6edf3;">actions...</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">Restoring</span> <span style="color: #e6edf3;">/home/l1x/code/dotnet-test/dotnet-test.csproj:</span>
</div><div class="line" data-line="6">  <span style="color: #d2a8ff;">Determining</span> <span style="color: #e6edf3;">projects</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">restore...</span>
</div><div class="line" data-line="7">  <span style="color: #d2a8ff;">Restored</span> <span style="color: #e6edf3;">/home/l1x/code/dotnet-test/dotnet-test.csproj</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">in</span> <span style="color: #e6edf3;">21.68</span> <span style="color: #e6edf3;">sec</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;"></span><span style="color: #d2a8ff;">.</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">Restore</span> <span style="color: #e6edf3;">succeeded.</span>
</div></code></pre>
<p>The web application code is straightforward. The only thing I have added is response compression.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">using System.IO.Compression;
</div><div class="line" data-line="2">using Microsoft.AspNetCore.ResponseCompression;
</div><div class="line" data-line="3">
</div><div class="line" data-line="4">var builder = WebApplication.CreateBuilder(args);
</div><div class="line" data-line="5">
</div><div class="line" data-line="6">builder.Services.AddEndpointsApiExplorer();
</div><div class="line" data-line="7">builder.Services.AddSwaggerGen();
</div><div class="line" data-line="8">
</div><div class="line" data-line="9">builder.Services.AddResponseCompression(options =&gt; &lbrace;
</div><div class="line" data-line="10">  options.EnableForHttps = true;
</div><div class="line" data-line="11">  options.Providers.Add&lt;BrotliCompressionProvider&gt;();
</div><div class="line" data-line="12">  options.Providers.Add&lt;GzipCompressionProvider&gt;();
</div><div class="line" data-line="13">&rbrace;);
</div><div class="line" data-line="14">
</div><div class="line" data-line="15">var app = builder.Build();
</div><div class="line" data-line="16">
</div><div class="line" data-line="17">// if (app.Environment.IsDevelopment()) &lbrace;
</div><div class="line" data-line="18">app.UseSwagger();
</div><div class="line" data-line="19">app.UseSwaggerUI();
</div><div class="line" data-line="20">// &rbrace;
</div><div class="line" data-line="21">
</div><div class="line" data-line="22">app.UseResponseCompression();
</div><div class="line" data-line="23">app.UseHttpsRedirection();
</div><div class="line" data-line="24">
</div><div class="line" data-line="25">var summaries = new[] &lbrace;
</div><div class="line" data-line="26">  &quot;Freezing&quot;,
</div><div class="line" data-line="27">  &quot;Bracing&quot;,
</div><div class="line" data-line="28">  &quot;Chilly&quot;,
</div><div class="line" data-line="29">  &quot;Cool&quot;,
</div><div class="line" data-line="30">  &quot;Mild&quot;,
</div><div class="line" data-line="31">  &quot;Warm&quot;,
</div><div class="line" data-line="32">  &quot;Balmy&quot;,
</div><div class="line" data-line="33">  &quot;Hot&quot;,
</div><div class="line" data-line="34">  &quot;Sweltering&quot;,
</div><div class="line" data-line="35">  &quot;Scorching&quot;
</div><div class="line" data-line="36">&rbrace;;
</div><div class="line" data-line="37">
</div><div class="line" data-line="38">app.MapGet(&quot;/weatherforecast&quot;, () =&gt; &lbrace;
</div><div class="line" data-line="39">    var forecast = Enumerable.Range(1, 5).Select(index =&gt;
</div><div class="line" data-line="40">        new WeatherForecast(
</div><div class="line" data-line="41">          DateOnly.FromDateTime(DateTime.Now.AddDays(index)),
</div><div class="line" data-line="42">          Random.Shared.Next(-20, 55),
</div><div class="line" data-line="43">          summaries[Random.Shared.Next(summaries.Length)]
</div><div class="line" data-line="44">        ))
</div><div class="line" data-line="45">      .ToArray();
</div><div class="line" data-line="46">    return forecast;
</div><div class="line" data-line="47">  &rbrace;)
</div><div class="line" data-line="48">  .WithName(&quot;GetWeatherForecast&quot;)
</div><div class="line" data-line="49">  .WithOpenApi();
</div><div class="line" data-line="50">
</div><div class="line" data-line="51">app.Run();
</div><div class="line" data-line="52">
</div><div class="line" data-line="53">record WeatherForecast(DateOnly Date, int TemperatureC, string? Summary) &lbrace;
</div><div class="line" data-line="54">  public int TemperatureF =&gt; 32 + (int)(TemperatureC / 0.5556);
</div><div class="line" data-line="55">&rbrace;
</div></code></pre>
<h2 id="creating-dockerfiles"><a href="#creating-dockerfiles">Creating Dockerfiles</a></h2>
<p>I am assuming that Docker is already installed on the system. If not, the following packages should be installed:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">containerd.io/bookworm,now 1.6.28-1 amd64 [installed,automatic]
</div><div class="line" data-line="2">docker-ce-cli/bookworm,now 5:25.0.3-1~debian.12~bookworm amd64 [installed]
</div><div class="line" data-line="3">docker-ce/bookworm,now 5:25.0.3-1~debian.12~bookworm amd64 [installed]
</div></code></pre>
<p>The first Dockerfile uses the default images and stages for the build:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build-env
</div><div class="line" data-line="2">WORKDIR /app
</div><div class="line" data-line="3">COPY . ./
</div><div class="line" data-line="4">RUN dotnet restore
</div><div class="line" data-line="5">RUN dotnet publish -c Release -o out
</div><div class="line" data-line="6">
</div><div class="line" data-line="7">FROM mcr.microsoft.com/dotnet/aspnet:8.0
</div><div class="line" data-line="8">WORKDIR /app
</div><div class="line" data-line="9">COPY --from=build-env /app/out .
</div><div class="line" data-line="10">ENTRYPOINT [&quot;dotnet&quot;, &quot;dotnet-test.dll&quot;]
</div></code></pre>
<p>Building the image with this Dockerfile:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">l1x</span><span style="color: #d2a8ff;">@deby</span> <span style="color: #e6edf3;">~/c/dotnet-test</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">main</span><span style="color: #e6edf3;">)</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">-t</span> <span style="color: #e6edf3;">dotnet-test:default</span> <span style="color: #e6edf3;">.</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">+</span>] Building 19.6s <span style="color: #e6edf3;">(</span>14/14<span style="color: #e6edf3;">)</span> <span style="color: #79c0ff;">FINISHED</span>                                                                                                                                 docker:default
</div><div class="line" data-line="3"> <span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">internal</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;"></span> <span style="color: #d2a8ff;">load</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">definition</span> <span style="color: #e6edf3;">from</span> <span style="color: #e6edf3;">Dockerfile</span>                                                                                                                          <span style="color: #e6edf3;">0.2s</span>
</div><div class="line" data-line="4"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">transferring</span> <span style="color: #e6edf3;">dockerfile:</span> <span style="color: #e6edf3;">300B</span>                                                                                                                                          <span style="color: #e6edf3;">0.0s</span>
</div><div class="line" data-line="5"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">internal</span><span style="color: #a5d6ff;">]</span> <span style="color: #e6edf3;">load</span> <span style="color: #e6edf3;">metadata</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">mcr.microsoft.com/dotnet/aspnet:8.0</span>                                                                                                          <span style="color: #e6edf3;">0.0s</span>
</div><div class="line" data-line="6"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">internal</span><span style="color: #a5d6ff;">]</span> <span style="color: #e6edf3;">load</span> <span style="color: #e6edf3;">metadata</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">mcr.microsoft.com/dotnet/sdk:8.0</span>                                                                                                             <span style="color: #e6edf3;">0.0s</span>
</div><div class="line" data-line="7"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">internal</span><span style="color: #a5d6ff;">]</span> <span style="color: #e6edf3;">load</span> <span style="color: #e6edf3;">.dockerignore</span>                                                                                                                                             <span style="color: #e6edf3;">0.2s</span>
</div><div class="line" data-line="8"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">transferring</span> <span style="color: #e6edf3;">context:</span> <span style="color: #e6edf3;">2B</span>                                                                                                                                               <span style="color: #e6edf3;">0.0s</span>
</div><div class="line" data-line="9"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">build-env</span> <span style="color: #a5d6ff;">1/5</span><span style="color: #a5d6ff;">]</span> <span style="color: #e6edf3;">FROM</span> <span style="color: #e6edf3;">mcr.microsoft.com/dotnet/sdk:8.0</span>                                                                                                                     <span style="color: #e6edf3;">1.4s</span>
</div><div class="line" data-line="10"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">stage-1</span> <span style="color: #a5d6ff;">1/3</span><span style="color: #a5d6ff;">]</span> <span style="color: #e6edf3;">FROM</span> <span style="color: #e6edf3;">mcr.microsoft.com/dotnet/aspnet:8.0</span>                                                                                                                    <span style="color: #e6edf3;">0.8s</span>
</div><div class="line" data-line="11"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">internal</span><span style="color: #a5d6ff;">]</span> <span style="color: #e6edf3;">load</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">context</span>                                                                                                                                             <span style="color: #e6edf3;">0.7s</span>
</div><div class="line" data-line="12"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">transferring</span> <span style="color: #e6edf3;">context:</span> <span style="color: #e6edf3;">12.23MB</span>                                                                                                                                          <span style="color: #e6edf3;">0.2s</span>
</div><div class="line" data-line="13"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">stage-1</span> <span style="color: #a5d6ff;">2/3</span><span style="color: #a5d6ff;">]</span> <span style="color: #e6edf3;">WORKDIR</span> <span style="color: #e6edf3;">/app</span>                                                                                                                                                <span style="color: #e6edf3;">0.5s</span>
</div><div class="line" data-line="14"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">build-env</span> <span style="color: #a5d6ff;">2/5</span><span style="color: #a5d6ff;">]</span> <span style="color: #e6edf3;">WORKDIR</span> <span style="color: #e6edf3;">/app</span>                                                                                                                                              <span style="color: #e6edf3;">0.2s</span>
</div><div class="line" data-line="15"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">build-env</span> <span style="color: #a5d6ff;">3/5</span><span style="color: #a5d6ff;">]</span> <span style="color: #e6edf3;">COPY</span> <span style="color: #e6edf3;">.</span> <span style="color: #e6edf3;">./</span>                                                                                                                                                 <span style="color: #e6edf3;">0.2s</span>
</div><div class="line" data-line="16"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">build-env</span> <span style="color: #a5d6ff;">4/5</span><span style="color: #a5d6ff;">]</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #e6edf3;">dotnet</span> <span style="color: #e6edf3;">restore</span>                                                                                                                                        <span style="color: #e6edf3;">8.3s</span>
</div><div class="line" data-line="17"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">build-env</span> <span style="color: #a5d6ff;">5/5</span><span style="color: #a5d6ff;">]</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #e6edf3;">dotnet</span> <span style="color: #e6edf3;">publish</span> <span style="color: #e6edf3;">-c</span> <span style="color: #e6edf3;">Release</span> <span style="color: #e6edf3;">-o</span> <span style="color: #e6edf3;">out</span>                                                                                                                      <span style="color: #e6edf3;">8.1s</span>
</div><div class="line" data-line="18"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">stage-1</span> <span style="color: #a5d6ff;">3/3</span><span style="color: #a5d6ff;">]</span> <span style="color: #e6edf3;">COPY</span> <span style="color: #e6edf3;">--from=build-env</span> <span style="color: #e6edf3;">/app/out</span> <span style="color: #e6edf3;">.</span>                                                                                                                            <span style="color: #e6edf3;">0.3s</span>
</div><div class="line" data-line="19"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">exporting</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">image</span>                                                                                                                                                        <span style="color: #e6edf3;">0.3s</span>
</div><div class="line" data-line="20"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">exporting</span> <span style="color: #e6edf3;">layers</span>                                                                                                                                                       <span style="color: #e6edf3;">0.3s</span>
</div><div class="line" data-line="21"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">writing</span> <span style="color: #e6edf3;">image</span> <span style="color: #e6edf3;">sha256:265648afb3c8fecba0a1c22ad196b552096819eeccf4444d0ec0f7adec678b31</span>                                                                                  <span style="color: #e6edf3;">0.0s</span>
</div><div class="line" data-line="22"> <span style="color: #d2a8ff;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">=</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">naming</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">docker.io/library/dotnet-test:default</span>                                                                                                                        <span style="color: #e6edf3;">0.0s</span>
</div></code></pre>
<p>We have the new buildx in Docker, but it seems dangling images are still a thing. I guess these are kept around for caching.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">l1x</span><span style="color: #d2a8ff;">@deby</span> <span style="color: #e6edf3;">~/c/dotnet-test</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">main</span><span style="color: #e6edf3;">)</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">images</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">REPOSITORY</span>                        <span style="color: #e6edf3;">TAG</span>                  <span style="color: #e6edf3;">IMAGE</span> <span style="color: #e6edf3;">ID</span>       <span style="color: #e6edf3;">CREATED</span>          <span style="color: #e6edf3;">SIZE</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">dotnet-test</span>                       <span style="color: #e6edf3;">default</span>              <span style="color: #e6edf3;">265648afb3c8</span>   <span style="color: #79c0ff;">28</span> <span style="color: #e6edf3;">seconds</span> <span style="color: #e6edf3;">ago</span>   <span style="color: #e6edf3;">221MB</span>
</div><div class="line" data-line="4"><span style="color: #79c0ff;">&lt;</span><span style="color: #e6edf3;">none</span><span style="color: #79c0ff;">&gt;</span>                            <span style="color: #79c0ff;">&lt;</span><span style="color: #e6edf3;">none</span><span style="color: #79c0ff;">&gt;</span>               <span style="color: #e6edf3;">cf116155bb30</span>   <span style="color: #79c0ff;">2</span> <span style="color: #e6edf3;">minutes</span> <span style="color: #e6edf3;">ago</span>    <span style="color: #e6edf3;">221MB</span>
</div></code></pre>
<p>A 221MB container image is not that bad.</p>
<p>Cleaning up the dangling images is easy:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">docker</span> <span style="color: #e6edf3;">rmi</span> <span style="color: #e6edf3;">$(</span><span style="color: #d2a8ff;">docker</span> <span style="color: #e6edf3;">images</span> <span style="color: #e6edf3;">-f</span> <span style="color: #a5d6ff;">&quot;dangling=true&quot;</span> <span style="color: #e6edf3;">-q</span><span style="color: #e6edf3;">)</span>
</div></code></pre>
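<p>One caveat: when there are no dangling images, the inner command prints nothing and <code>docker rmi</code> fails because it needs at least one argument. A minimal sketch of a guarded version follows; the function name <code>remove_dangling</code> is made up for illustration. The built-in <code>docker image prune</code> does the same job if you prefer not to script it.</p>

```shell
#!/usr/bin/env bash
# Remove dangling images, but skip the call when there is nothing to remove
# (docker rmi with an empty argument list exits with an error).
remove_dangling() {
  ids="$(docker images -f "dangling=true" -q)"
  if [ -z "$ids" ]; then
    echo "no dangling images"
    return 0
  fi
  # Intentionally unquoted: one argument per image ID.
  docker rmi $ids
}

# Only run when the docker CLI is available on this machine.
if command -v docker > /dev/null; then
  remove_dangling
fi
```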
<p>There are three versions of base images that I would like to use:</p>
<ul>
<li>Debian (this is the default)</li>
</ul>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build-env
</div><div class="line" data-line="2">WORKDIR /app
</div><div class="line" data-line="3">COPY . ./
</div><div class="line" data-line="4">RUN dotnet restore
</div><div class="line" data-line="5">RUN dotnet publish -c Release -o out
</div><div class="line" data-line="6">
</div><div class="line" data-line="7">FROM mcr.microsoft.com/dotnet/aspnet:8.0-bookworm-slim
</div><div class="line" data-line="8">WORKDIR /app
</div><div class="line" data-line="9">COPY --from=build-env /app/out .
</div><div class="line" data-line="10">ENTRYPOINT [&quot;dotnet&quot;, &quot;dotnet-test.dll&quot;]
</div></code></pre>
<ul>
<li>Alpine 3.18</li>
</ul>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build-env
</div><div class="line" data-line="2">WORKDIR /app
</div><div class="line" data-line="3">COPY . ./
</div><div class="line" data-line="4">RUN dotnet restore
</div><div class="line" data-line="5">RUN dotnet publish -c Release -o out
</div><div class="line" data-line="6">
</div><div class="line" data-line="7">FROM mcr.microsoft.com/dotnet/aspnet:8.0-alpine3.18
</div><div class="line" data-line="8">WORKDIR /app
</div><div class="line" data-line="9">COPY --from=build-env /app/out .
</div><div class="line" data-line="10">ENTRYPOINT [&quot;dotnet&quot;, &quot;dotnet-test.dll&quot;]
</div></code></pre>
<ul>
<li>Alpine 3.19</li>
</ul>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build-env
</div><div class="line" data-line="2">WORKDIR /app
</div><div class="line" data-line="3">COPY . ./
</div><div class="line" data-line="4">RUN dotnet restore
</div><div class="line" data-line="5">RUN dotnet publish -c Release -o out
</div><div class="line" data-line="6">
</div><div class="line" data-line="7">FROM mcr.microsoft.com/dotnet/aspnet:8.0-alpine3.19
</div><div class="line" data-line="8">WORKDIR /app
</div><div class="line" data-line="9">COPY --from=build-env /app/out .
</div><div class="line" data-line="10">ENTRYPOINT [&quot;dotnet&quot;, &quot;dotnet-test.dll&quot;]
</div></code></pre>
<p>I would like to build all of these at once. In CI/CD these are going to be separate steps.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">cat</span> <span style="color: #e6edf3;">build.all.sh</span>
</div><div class="line" data-line="2"><span style="color: #8b949e;">#!/usr/bin/env bash</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">docker</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">-t</span> <span style="color: #e6edf3;">dotnet-test:net8.alpine3.18</span> <span style="color: #e6edf3;">-f</span> <span style="color: #e6edf3;">Dockerfile.net8.alpine3.18</span> <span style="color: #e6edf3;">.</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">docker</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">-t</span> <span style="color: #e6edf3;">dotnet-test:net8.alpine3.19</span> <span style="color: #e6edf3;">-f</span> <span style="color: #e6edf3;">Dockerfile.net8.alpine3.19</span> <span style="color: #e6edf3;">.</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">docker</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">-t</span> <span style="color: #e6edf3;">dotnet-test:net8.bookworm-slim</span> <span style="color: #e6edf3;">-f</span> <span style="color: #e6edf3;">Dockerfile.net8.bookworm-slim</span> <span style="color: #e6edf3;">.</span>
</div></code></pre>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">docker</span> <span style="color: #e6edf3;">images</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">REPOSITORY</span>                        <span style="color: #e6edf3;">TAG</span>                  <span style="color: #e6edf3;">IMAGE</span> <span style="color: #e6edf3;">ID</span>       <span style="color: #e6edf3;">CREATED</span>          <span style="color: #e6edf3;">SIZE</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">dotnet-test</span>                       <span style="color: #e6edf3;">net8.alpine3.19</span>      <span style="color: #e6edf3;">8730308faf31</span>   <span style="color: #79c0ff;">54</span> <span style="color: #e6edf3;">seconds</span> <span style="color: #e6edf3;">ago</span>   <span style="color: #e6edf3;">111MB</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">dotnet-test</span>                       <span style="color: #e6edf3;">net8.alpine3.18</span>      <span style="color: #e6edf3;">694a028c16e6</span>   <span style="color: #79c0ff;">55</span> <span style="color: #e6edf3;">seconds</span> <span style="color: #e6edf3;">ago</span>   <span style="color: #e6edf3;">110MB</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">dotnet-test</span>                       <span style="color: #e6edf3;">default</span>              <span style="color: #e6edf3;">265648afb3c8</span>   <span style="color: #79c0ff;">9</span> <span style="color: #e6edf3;">minutes</span> <span style="color: #e6edf3;">ago</span>    <span style="color: #e6edf3;">221MB</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">dotnet-test</span>                       <span style="color: #e6edf3;">net8.bookworm-slim</span>   <span style="color: #e6edf3;">265648afb3c8</span>   <span style="color: #79c0ff;">9</span> <span style="color: #e6edf3;">minutes</span> <span style="color: #e6edf3;">ago</span>    <span style="color: #e6edf3;">221MB</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">mcr.microsoft.com/dotnet/sdk</span>      <span style="color: #e6edf3;">8.0</span>                  <span style="color: #e6edf3;">54ed1faefb92</span>   <span style="color: #79c0ff;">3</span> <span style="color: #e6edf3;">days</span> <span style="color: #e6edf3;">ago</span>       <span style="color: #e6edf3;">866MB</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">mcr.microsoft.com/dotnet/aspnet</span>   <span style="color: #e6edf3;">8.0</span>                  <span style="color: #e6edf3;">6eedb7553b12</span>   <span style="color: #79c0ff;">6</span> <span style="color: #e6edf3;">days</span> <span style="color: #e6edf3;">ago</span>       <span style="color: #e6edf3;">217MB</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">mcr.microsoft.com/dotnet/aspnet</span>   <span style="color: #e6edf3;">8.0-bookworm-slim</span>    <span style="color: #e6edf3;">6eedb7553b12</span>   <span style="color: #79c0ff;">6</span> <span style="color: #e6edf3;">days</span> <span style="color: #e6edf3;">ago</span>       <span style="color: #e6edf3;">217MB</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">mcr.microsoft.com/dotnet/aspnet</span>   <span style="color: #e6edf3;">8.0-alpine3.18</span>       <span style="color: #e6edf3;">20dce08103c1</span>   <span style="color: #79c0ff;">6</span> <span style="color: #e6edf3;">days</span> <span style="color: #e6edf3;">ago</span>       <span style="color: #e6edf3;">106MB</span>
</div><div class="line" data-line="11"><span style="color: #d2a8ff;">mcr.microsoft.com/dotnet/aspnet</span>   <span style="color: #e6edf3;">8.0-alpine3.19</span>       <span style="color: #e6edf3;">baa6902e9e30</span>   <span style="color: #79c0ff;">6</span> <span style="color: #e6edf3;">days</span> <span style="color: #e6edf3;">ago</span>       <span style="color: #e6edf3;">107MB</span>
</div></code></pre>
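<p>The build script above can also be written as a loop, which keeps the tag and the Dockerfile name in sync when a new variant is added. This is only a sketch, assuming the Dockerfiles keep the <code>Dockerfile.net8.&lt;variant&gt;</code> naming; <code>build_cmd</code> and <code>DRY_RUN</code> are hypothetical names introduced for illustration.</p>

```shell
#!/usr/bin/env bash
# Sketch: one docker build per variant, driven by a single list.

# Prints the build command for one variant (tag and Dockerfile share the suffix).
build_cmd() {
  printf 'docker build -t dotnet-test:net8.%s -f Dockerfile.net8.%s .\n' "$1" "$1"
}

# DRY_RUN is a hypothetical switch: by default the commands are only printed.
for variant in alpine3.18 alpine3.19 bookworm-slim; do
  if [ "${DRY_RUN:-1}" = "1" ]; then
    build_cmd "$variant"
  else
    sh -c "$(build_cmd "$variant")"
  fi
done
```

<p>Running it as is prints the three build commands; setting <code>DRY_RUN=0</code> executes them one after the other.</p>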
<p>It is time for a little performance check.</p>
<h2 id="simple-performance-check"><a href="#simple-performance-check">Simple performance check</a></h2>
<p>Using docker run with simple port mapping is the easiest way to expose the service port and run the performance test. For this test I use a new Graviton box (c7g.2xlarge, 8 vCPUs, 16 GiB RAM) on AWS.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">[</span><span style="color: #d2a8ff;">ec2-user@ip-10-20-1-235</span> <span style="color: #e6edf3;">~</span><span style="color: #e6edf3;">]</span>$ <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">run</span> <span style="color: #e6edf3;">-p</span> <span style="color: #e6edf3;">8000:8000</span> <span style="color: #e6edf3;">-ti</span> <span style="color: #e6edf3;">dotnet-test:net8.alpine3.19</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">warn:</span> <span style="color: #e6edf3;">Microsoft.AspNetCore.Hosting.Diagnostics</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">15</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="3">      <span style="color: #e6edf3;">Overriding</span> <span style="color: #e6edf3;">HTTP_PORTS</span> <span style="color: #a5d6ff;">&#39;8080&#39;</span> <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">HTTPS_PORTS</span> <span style="color: #a5d6ff;">&#39;&#39;</span><span style="color: #e6edf3;">.</span> <span style="color: #e6edf3;">Binding</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">values</span> <span style="color: #e6edf3;">defined</span> <span style="color: #e6edf3;">by</span> <span style="color: #e6edf3;">URLS</span> <span style="color: #e6edf3;">instead</span> <span style="color: #a5d6ff;">&#39;http://0.0.0.0:8000&#39;</span><span style="color: #e6edf3;">.</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">info:</span> <span style="color: #e6edf3;">Microsoft.Hosting.Lifetime</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">14</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="5">      <span style="color: #e6edf3;">Now</span> <span style="color: #e6edf3;">listening</span> <span style="color: #e6edf3;">on:</span> <span style="color: #e6edf3;">http://0.0.0.0:8000</span>
</div><div class="line" data-line="6"><span style="color: #e6edf3;">info:</span> <span style="color: #e6edf3;">Microsoft.Hosting.Lifetime</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">0</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="7">      <span style="color: #e6edf3;">Application</span> <span style="color: #e6edf3;">started.</span> <span style="color: #e6edf3;">Press</span> <span style="color: #e6edf3;">Ctrl+C</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">shut</span> <span style="color: #e6edf3;">down.</span>
</div><div class="line" data-line="8"><span style="color: #e6edf3;">info:</span> <span style="color: #e6edf3;">Microsoft.Hosting.Lifetime</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">0</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="9">      <span style="color: #e6edf3;">Hosting</span> <span style="color: #e6edf3;">environment:</span> <span style="color: #e6edf3;">Production</span>
</div><div class="line" data-line="10"><span style="color: #e6edf3;">info:</span> <span style="color: #e6edf3;">Microsoft.Hosting.Lifetime</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">0</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="11">      <span style="color: #e6edf3;">Content</span> <span style="color: #e6edf3;">root</span> <span style="color: #e6edf3;">path:</span> <span style="color: #e6edf3;">/app</span>
</div></code></pre>
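<p>The load test is driven by drill and a <code>benchmark.yml</code> plan. The plan itself is not shown in this post, so the following is only a sketch that reproduces the settings drill prints in the results (concurrency, iterations, rampup, base URL and the <code>Fetch weather</code> step). The request path <code>/weatherforecast</code> is an assumption and has to be replaced with the real endpoint; <code>PLAN_FILE</code> is a hypothetical override for the output path.</p>

```shell
#!/usr/bin/env bash
# Sketch: write a drill plan matching the settings shown in the results.
# PLAN_FILE is a hypothetical override; the request path is an assumption.
plan_file="${PLAN_FILE:-benchmark.yml}"

printf '%s\n' \
  '---' \
  'concurrency: 3000' \
  "base: 'http://localhost:8000'" \
  'iterations: 500000' \
  'rampup: 2' \
  '' \
  'plan:' \
  '  - name: Fetch weather' \
  '    request:' \
  '      url: /weatherforecast' \
  > "$plan_file"

echo "wrote $plan_file"
```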
<h3 id="results"><a href="#results">Results</a></h3>
<ul>
<li>Alpine 3.19</li>
</ul>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">[</span><span style="color: #d2a8ff;">ec2-user@ip-10-20-1-235</span> <span style="color: #e6edf3;">~</span><span style="color: #e6edf3;">]</span>$ <span style="color: #e6edf3;">drill</span> <span style="color: #e6edf3;">-q</span> <span style="color: #e6edf3;">--benchmark</span> <span style="color: #e6edf3;">benchmark.yml</span> <span style="color: #e6edf3;">--stats</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">Concurrency</span> <span style="color: #79c0ff;">3000</span>
</div><div class="line" data-line="3"><span style="color: #e6edf3;">Iterations</span> <span style="color: #79c0ff;">500000</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">Rampup</span> <span style="color: #79c0ff;">2</span>
</div><div class="line" data-line="5"><span style="color: #e6edf3;">Base</span> <span style="color: #e6edf3;">URL</span> <span style="color: #e6edf3;">http://localhost:8000</span>
</div><div class="line" data-line="6">
</div><div class="line" data-line="7"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Total</span> <span style="color: #e6edf3;">requests</span>            <span style="color: #79c0ff;">500000</span>
</div><div class="line" data-line="8"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Successful</span> <span style="color: #e6edf3;">requests</span>       <span style="color: #79c0ff;">500000</span>
</div><div class="line" data-line="9"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Failed</span> <span style="color: #e6edf3;">requests</span>           <span style="color: #79c0ff;">0</span>
</div><div class="line" data-line="10"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Median</span> <span style="color: #e6edf3;">time</span> <span style="color: #e6edf3;">per</span> <span style="color: #e6edf3;">request</span>   <span style="color: #e6edf3;">23ms</span>
</div><div class="line" data-line="11"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Average</span> <span style="color: #e6edf3;">time</span> <span style="color: #e6edf3;">per</span> <span style="color: #e6edf3;">request</span>  <span style="color: #e6edf3;">25ms</span>
</div><div class="line" data-line="12"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Sample</span> <span style="color: #e6edf3;">standard</span> <span style="color: #e6edf3;">deviation</span> <span style="color: #e6edf3;">22ms</span>
</div><div class="line" data-line="13"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">99.0</span><span style="color: #a5d6ff;">&#39;th percentile        115ms</span>
</div><div class="line" data-line="14"><span style="color: #a5d6ff;">Fetch weather             99.5&#39;</span><span style="color: #e6edf3;">th</span> <span style="color: #e6edf3;">percentile</span>        <span style="color: #e6edf3;">134ms</span>
</div><div class="line" data-line="15"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">99.9</span><span style="color: #a5d6ff;">&#39;th percentile        424ms</span>
</div><div class="line" data-line="16"><span style="color: #a5d6ff;"></span>
</div><div class="line" data-line="17"><span style="color: #a5d6ff;">Time taken for tests      11.2 seconds</span>
</div><div class="line" data-line="18"><span style="color: #a5d6ff;">Total requests            500000</span>
</div><div class="line" data-line="19"><span style="color: #a5d6ff;">Successful requests       500000</span>
</div><div class="line" data-line="20"><span style="color: #a5d6ff;">Failed requests           0</span>
</div><div class="line" data-line="21"><span style="color: #a5d6ff;">Requests per second       44550.11 [#/sec]</span>
</div><div class="line" data-line="22"><span style="color: #a5d6ff;">Median time per request   23ms</span>
</div><div class="line" data-line="23"><span style="color: #a5d6ff;">Average time per request  25ms</span>
</div><div class="line" data-line="24"><span style="color: #a5d6ff;">Sample standard deviation 22ms</span>
</div><div class="line" data-line="25"><span style="color: #a5d6ff;">99.0&#39;</span><span style="color: #e6edf3;">th</span> <span style="color: #e6edf3;">percentile</span>        <span style="color: #e6edf3;">115ms</span>
</div><div class="line" data-line="26"><span style="color: #e6edf3;">99.5</span><span style="color: #a5d6ff;">&#39;th percentile        134ms</span>
</div><div class="line" data-line="27"><span style="color: #a5d6ff;">99.9&#39;</span><span style="color: #e6edf3;">th</span> <span style="color: #e6edf3;">percentile</span>        <span style="color: #e6edf3;">424ms</span>
</div></code></pre>
<ul>
<li>Alpine 3.18</li>
</ul>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">[</span><span style="color: #d2a8ff;">ec2-user@ip-10-20-1-235</span> <span style="color: #e6edf3;">~</span><span style="color: #e6edf3;">]</span>$ <span style="color: #e6edf3;">drill</span> <span style="color: #e6edf3;">-q</span> <span style="color: #e6edf3;">--benchmark</span> <span style="color: #e6edf3;">benchmark.yml</span> <span style="color: #e6edf3;">--stats</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">Concurrency</span> <span style="color: #79c0ff;">3000</span>
</div><div class="line" data-line="3"><span style="color: #e6edf3;">Iterations</span> <span style="color: #79c0ff;">500000</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">Rampup</span> <span style="color: #79c0ff;">2</span>
</div><div class="line" data-line="5"><span style="color: #e6edf3;">Base</span> <span style="color: #e6edf3;">URL</span> <span style="color: #e6edf3;">http://localhost:8000</span>
</div><div class="line" data-line="6">
</div><div class="line" data-line="7"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Total</span> <span style="color: #e6edf3;">requests</span>            <span style="color: #79c0ff;">500000</span>
</div><div class="line" data-line="8"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Successful</span> <span style="color: #e6edf3;">requests</span>       <span style="color: #79c0ff;">500000</span>
</div><div class="line" data-line="9"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Failed</span> <span style="color: #e6edf3;">requests</span>           <span style="color: #79c0ff;">0</span>
</div><div class="line" data-line="10"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Median</span> <span style="color: #e6edf3;">time</span> <span style="color: #e6edf3;">per</span> <span style="color: #e6edf3;">request</span>   <span style="color: #e6edf3;">24ms</span>
</div><div class="line" data-line="11"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Average</span> <span style="color: #e6edf3;">time</span> <span style="color: #e6edf3;">per</span> <span style="color: #e6edf3;">request</span>  <span style="color: #e6edf3;">25ms</span>
</div><div class="line" data-line="12"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Sample</span> <span style="color: #e6edf3;">standard</span> <span style="color: #e6edf3;">deviation</span> <span style="color: #e6edf3;">23ms</span>
</div><div class="line" data-line="13"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">99.0</span><span style="color: #a5d6ff;">&#39;th percentile        115ms</span>
</div><div class="line" data-line="14"><span style="color: #a5d6ff;">Fetch weather             99.5&#39;</span><span style="color: #e6edf3;">th</span> <span style="color: #e6edf3;">percentile</span>        <span style="color: #e6edf3;">132ms</span>
</div><div class="line" data-line="15"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">99.9</span><span style="color: #a5d6ff;">&#39;th percentile        420ms</span>
</div><div class="line" data-line="16"><span style="color: #a5d6ff;"></span>
</div><div class="line" data-line="17"><span style="color: #a5d6ff;">Time taken for tests      11.3 seconds</span>
</div><div class="line" data-line="18"><span style="color: #a5d6ff;">Total requests            500000</span>
</div><div class="line" data-line="19"><span style="color: #a5d6ff;">Successful requests       500000</span>
</div><div class="line" data-line="20"><span style="color: #a5d6ff;">Failed requests           0</span>
</div><div class="line" data-line="21"><span style="color: #a5d6ff;">Requests per second       44315.59 [#/sec]</span>
</div><div class="line" data-line="22"><span style="color: #a5d6ff;">Median time per request   24ms</span>
</div><div class="line" data-line="23"><span style="color: #a5d6ff;">Average time per request  25ms</span>
</div><div class="line" data-line="24"><span style="color: #a5d6ff;">Sample standard deviation 23ms</span>
</div><div class="line" data-line="25"><span style="color: #a5d6ff;">99.0&#39;</span><span style="color: #e6edf3;">th</span> <span style="color: #e6edf3;">percentile</span>        <span style="color: #e6edf3;">115ms</span>
</div><div class="line" data-line="26"><span style="color: #e6edf3;">99.5</span><span style="color: #a5d6ff;">&#39;th percentile        132ms</span>
</div><div class="line" data-line="27"><span style="color: #a5d6ff;">99.9&#39;</span><span style="color: #e6edf3;">th</span> <span style="color: #e6edf3;">percentile</span>        <span style="color: #e6edf3;">420ms</span>
</div></code></pre>
<ul>
<li>Bookworm</li>
</ul>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">[</span><span style="color: #d2a8ff;">ec2-user@ip-10-20-1-235</span> <span style="color: #e6edf3;">~</span><span style="color: #e6edf3;">]</span>$ <span style="color: #e6edf3;">drill</span> <span style="color: #e6edf3;">-q</span> <span style="color: #e6edf3;">--benchmark</span> <span style="color: #e6edf3;">benchmark.yml</span> <span style="color: #e6edf3;">--stats</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">Concurrency</span> <span style="color: #79c0ff;">3000</span>
</div><div class="line" data-line="3"><span style="color: #e6edf3;">Iterations</span> <span style="color: #79c0ff;">500000</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">Rampup</span> <span style="color: #79c0ff;">2</span>
</div><div class="line" data-line="5"><span style="color: #e6edf3;">Base</span> <span style="color: #e6edf3;">URL</span> <span style="color: #e6edf3;">http://localhost:8000</span>
</div><div class="line" data-line="6">
</div><div class="line" data-line="7">
</div><div class="line" data-line="8"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Total</span> <span style="color: #e6edf3;">requests</span>            <span style="color: #79c0ff;">500000</span>
</div><div class="line" data-line="9"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Successful</span> <span style="color: #e6edf3;">requests</span>       <span style="color: #79c0ff;">500000</span>
</div><div class="line" data-line="10"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Failed</span> <span style="color: #e6edf3;">requests</span>           <span style="color: #79c0ff;">0</span>
</div><div class="line" data-line="11"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Median</span> <span style="color: #e6edf3;">time</span> <span style="color: #e6edf3;">per</span> <span style="color: #e6edf3;">request</span>   <span style="color: #e6edf3;">23ms</span>
</div><div class="line" data-line="12"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Average</span> <span style="color: #e6edf3;">time</span> <span style="color: #e6edf3;">per</span> <span style="color: #e6edf3;">request</span>  <span style="color: #e6edf3;">25ms</span>
</div><div class="line" data-line="13"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">Sample</span> <span style="color: #e6edf3;">standard</span> <span style="color: #e6edf3;">deviation</span> <span style="color: #e6edf3;">20ms</span>
</div><div class="line" data-line="14"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">99.0</span><span style="color: #a5d6ff;">&#39;th percentile        96ms</span>
</div><div class="line" data-line="15"><span style="color: #a5d6ff;">Fetch weather             99.5&#39;</span><span style="color: #e6edf3;">th</span> <span style="color: #e6edf3;">percentile</span>        <span style="color: #e6edf3;">116ms</span>
</div><div class="line" data-line="16"><span style="color: #e6edf3;">Fetch</span> <span style="color: #e6edf3;">weather</span>             <span style="color: #e6edf3;">99.9</span><span style="color: #a5d6ff;">&#39;th percentile        395ms</span>
</div><div class="line" data-line="17"><span style="color: #a5d6ff;"></span>
</div><div class="line" data-line="18"><span style="color: #a5d6ff;">Time taken for tests      11.1 seconds</span>
</div><div class="line" data-line="19"><span style="color: #a5d6ff;">Total requests            500000</span>
</div><div class="line" data-line="20"><span style="color: #a5d6ff;">Successful requests       500000</span>
</div><div class="line" data-line="21"><span style="color: #a5d6ff;">Failed requests           0</span>
</div><div class="line" data-line="22"><span style="color: #a5d6ff;">Requests per second       45201.00 [#/sec]</span>
</div><div class="line" data-line="23"><span style="color: #a5d6ff;">Median time per request   23ms</span>
</div><div class="line" data-line="24"><span style="color: #a5d6ff;">Average time per request  25ms</span>
</div><div class="line" data-line="25"><span style="color: #a5d6ff;">Sample standard deviation 20ms</span>
</div><div class="line" data-line="26"><span style="color: #a5d6ff;">99.0&#39;</span><span style="color: #e6edf3;">th</span> <span style="color: #e6edf3;">percentile</span>        <span style="color: #e6edf3;">96ms</span>
</div><div class="line" data-line="27"><span style="color: #e6edf3;">99.5</span><span style="color: #a5d6ff;">&#39;th percentile        116ms</span>
</div><div class="line" data-line="28"><span style="color: #a5d6ff;">99.9&#39;</span><span style="color: #e6edf3;">th</span> <span style="color: #e6edf3;">percentile</span>        <span style="color: #e6edf3;">395ms</span>
</div></code></pre>
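<p>The Debian (Bookworm) image comes out slightly ahead in both throughput and tail latency. The difference is probably due to musl (Alpine) versus glibc (Debian).</p>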
<table>
<thead>
<tr>
<th></th>
<th>Alpine 3.19</th>
<th>Alpine 3.18</th>
<th>Bookworm</th>
</tr>
</thead>
<tbody>
<tr>
<td>Requests per second [#/sec]</td>
<td>44550.11</td>
<td>44315.59</td>
<td>45201.00</td>
</tr>
</tbody>
</table>
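<p>To put the table in perspective, the relative gain can be computed from the requests per second figures. A small sketch using awk; <code>pct_faster</code> is a helper name made up for this example. With the numbers above, Bookworm is roughly 1.5% ahead of Alpine 3.19 and 2.0% ahead of Alpine 3.18, which is small enough that a single run per image cannot rule out noise.</p>

```shell
#!/usr/bin/env bash
# Relative throughput of Bookworm versus each Alpine image, from the table above.
pct_faster() {
  # $1 = baseline req/s, $2 = candidate req/s; prints the gain in percent
  awk -v base="$1" -v cand="$2" 'BEGIN { printf "%.2f\n", (cand - base) / base * 100 }'
}

echo "Bookworm vs Alpine 3.19: $(pct_faster 44550.11 45201.00)%"
echo "Bookworm vs Alpine 3.18: $(pct_faster 44315.59 45201.00)%"
```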
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sat, 17 Feb 2024 13:45:01 +0100</pubDate>
    </item>
    <item>
      <title>Running Bun as AWS Lambda function</title>
      <link>https://dev.l1x.be/posts/2023/10/19/running-bun-as-aws-lambda-function/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2023/10/19/running-bun-as-aws-lambda-function/</guid>
      <content:encoded><![CDATA[<p><img src="/static/img/og/bun.webp" alt="Bun" /></p>
<h2 id="what-is-bun"><a href="#what-is-bun">What is Bun?</a></h2>
<p>Bun is a toolkit for JavaScript and TypeScript applications, packaged as a single executable named &quot;bun.&quot; It acts as a runtime, manages packages, bundles code, and runs tests, ensuring a smooth and effective development workflow. Developed in Zig and powered by <a href="https://developer.apple.com/documentation/javascriptcore">JavaScriptCore</a>, it drastically cuts down startup times and memory consumption.</p>
<p>Three things stand out to me:</p>
<ol>
<li>
<p>All-in-One Solution: Bun bundles a runtime, package manager, bundler, and test runner in one tool, so there is no need to pick and integrate each of these separately.</p>
</li>
<li>
<p>Performance Optimization: Bun is designed with a focus on performance. On projects where speed is a critical concern, it might offer advantages over the default Node.js setup.</p>
</li>
<li>
<p>Simplified Development: Bun aims to make the development workflow more straightforward by hiding much of the tooling complexity behind a single executable.</p>
</li>
</ol>
<p>Let's continue with the installation and setup for the AWS Lambda environment.</p>
<h3 id="installing-bun"><a href="#installing-bun">Installing Bun</a></h3>
<p>To install the bun CLI we can use Homebrew on macOS, or an equivalent package manager on other operating systems.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">brew</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">bun</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">==</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Downloading</span> <span style="color: #e6edf3;">https://formulae.brew.sh/api/formula.jws.json</span>
</div><div class="line" data-line="3"><span style="color: #8b949e;">################ 100.0%</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">==</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Downloading</span> <span style="color: #e6edf3;">https://formulae.brew.sh/api/cask.jws.json</span>
</div><div class="line" data-line="5"><span style="color: #8b949e;">################ 100.0%</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">==</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Fetching</span> <span style="color: #e6edf3;">oven-sh/bun/bun</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">==</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Summary</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">🍺</span>  <span style="color: #e6edf3;">/opt/homebrew/Cellar/bun/1.0.6:</span> <span style="color: #79c0ff;">7</span> <span style="color: #e6edf3;">files,</span> <span style="color: #e6edf3;">48.8MB,</span> <span style="color: #e6edf3;">built</span> <span style="color: #e6edf3;">in</span> <span style="color: #79c0ff;">2</span> <span style="color: #e6edf3;">seconds</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">==</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Running</span> `<span style="color: #d2a8ff;">brew</span> <span style="color: #e6edf3;">cleanup</span> <span style="color: #e6edf3;">bun</span>`<span style="color: #a5d6ff;">...</span>
</div></code></pre>
<p>Running it without any parameters prints the available commands:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">bun</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">Bun:</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">fast</span> <span style="color: #e6edf3;">JavaScript</span> <span style="color: #e6edf3;">runtime,</span> <span style="color: #e6edf3;">package</span> <span style="color: #e6edf3;">manager,</span> <span style="color: #e6edf3;">bundler</span> <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">test</span> <span style="color: #e6edf3;">runner.</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">1.0.6</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="3">
</div><div class="line" data-line="4">  <span style="color: #d2a8ff;">run</span>       <span style="color: #e6edf3;">./my-script.ts</span>       <span style="color: #e6edf3;">Run</span> <span style="color: #e6edf3;">JavaScript</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">Bun,</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">package.json</span> <span style="color: #e6edf3;">script,</span> <span style="color: #e6edf3;">or</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">bin</span>
</div><div class="line" data-line="5">  <span style="color: #d2a8ff;">test</span>                           <span style="color: #e6edf3;">Run</span> <span style="color: #e6edf3;">unit</span> <span style="color: #e6edf3;">tests</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">Bun</span>
</div><div class="line" data-line="6">  <span style="color: #d2a8ff;">x</span>         <span style="color: #e6edf3;">bun-repl</span>             <span style="color: #e6edf3;">Install</span> <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">execute</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">package</span> <span style="color: #e6edf3;">bin</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">bunx</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="7">  <span style="color: #d2a8ff;">repl</span>                           <span style="color: #e6edf3;">Start</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">REPL</span> <span style="color: #e6edf3;">session</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">Bun</span>
</div><div class="line" data-line="8">
</div><div class="line" data-line="9">  <span style="color: #d2a8ff;">init</span>                           <span style="color: #e6edf3;">Start</span> <span style="color: #e6edf3;">an</span> <span style="color: #e6edf3;">empty</span> <span style="color: #e6edf3;">Bun</span> <span style="color: #e6edf3;">project</span> <span style="color: #e6edf3;">from</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">blank</span> <span style="color: #e6edf3;">template</span>
</div><div class="line" data-line="10">  <span style="color: #d2a8ff;">create</span>    <span style="color: #e6edf3;">astro</span>                <span style="color: #e6edf3;">Create</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">new</span> <span style="color: #e6edf3;">project</span> <span style="color: #e6edf3;">from</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">template</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">bun</span> <span style="color: #e6edf3;">c</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="11">
</div><div class="line" data-line="12">  <span style="color: #d2a8ff;">install</span>                        <span style="color: #e6edf3;">Install</span> <span style="color: #e6edf3;">dependencies</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">package.json</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">bun</span> <span style="color: #e6edf3;">i</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="13">  <span style="color: #d2a8ff;">add</span>       <span style="color: #e6edf3;">zod</span>                  <span style="color: #e6edf3;">Add</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">dependency</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">package.json</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">bun</span> <span style="color: #e6edf3;">a</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="14">  <span style="color: #d2a8ff;">remove</span>    <span style="color: #e6edf3;">redux</span>                <span style="color: #e6edf3;">Remove</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">dependency</span> <span style="color: #e6edf3;">from</span> <span style="color: #e6edf3;">package.json</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">bun</span> <span style="color: #e6edf3;">rm</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="15">  <span style="color: #d2a8ff;">update</span>    <span style="color: #e6edf3;">tailwindcss</span>          <span style="color: #e6edf3;">Update</span> <span style="color: #e6edf3;">outdated</span> <span style="color: #e6edf3;">dependencies</span>
</div><div class="line" data-line="16">  <span style="color: #d2a8ff;">link</span>                           <span style="color: #e6edf3;">Link</span> <span style="color: #e6edf3;">an</span> <span style="color: #e6edf3;">npm</span> <span style="color: #e6edf3;">package</span> <span style="color: #e6edf3;">globally</span>
</div><div class="line" data-line="17">  <span style="color: #d2a8ff;">unlink</span>                         <span style="color: #e6edf3;">Globally</span> <span style="color: #e6edf3;">unlink</span> <span style="color: #e6edf3;">an</span> <span style="color: #e6edf3;">npm</span> <span style="color: #e6edf3;">package</span>
</div><div class="line" data-line="18">  <span style="color: #d2a8ff;">pm</span>                             <span style="color: #e6edf3;">More</span> <span style="color: #e6edf3;">commands</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">managing</span> <span style="color: #e6edf3;">packages</span>
</div><div class="line" data-line="19">
</div><div class="line" data-line="20">  <span style="color: #d2a8ff;">build</span>     <span style="color: #e6edf3;">./a.ts</span> <span style="color: #e6edf3;">./b.jsx</span>       <span style="color: #e6edf3;">Bundle</span> <span style="color: #e6edf3;">TypeScript</span> <span style="color: #e6edf3;">&amp;</span> <span style="color: #d2a8ff;">JavaScript</span> <span style="color: #e6edf3;">into</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">single</span> <span style="color: #e6edf3;">file</span>
</div><div class="line" data-line="21">
</div><div class="line" data-line="22">  <span style="color: #d2a8ff;">upgrade</span>                        <span style="color: #e6edf3;">Get</span> <span style="color: #e6edf3;">the</span> <span style="color: #e6edf3;">latest</span> <span style="color: #e6edf3;">version</span> <span style="color: #e6edf3;">of</span> <span style="color: #e6edf3;">Bun</span>
</div><div class="line" data-line="23">  <span style="color: #d2a8ff;">bun</span> <span style="color: #e6edf3;">--help</span>                     <span style="color: #e6edf3;">Show</span> <span style="color: #e6edf3;">all</span> <span style="color: #e6edf3;">supported</span> <span style="color: #e6edf3;">flags</span> <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">commands</span>
</div><div class="line" data-line="24">
</div><div class="line" data-line="25">  <span style="color: #d2a8ff;">Learn</span> <span style="color: #e6edf3;">more</span> <span style="color: #e6edf3;">about</span> <span style="color: #e6edf3;">Bun:</span>          <span style="color: #e6edf3;">https://bun.sh/docs</span>
</div><div class="line" data-line="26">  <span style="color: #d2a8ff;">Join</span> <span style="color: #e6edf3;">our</span> <span style="color: #e6edf3;">Discord</span> <span style="color: #e6edf3;">community:</span>    <span style="color: #e6edf3;">https://bun.sh/discord</span>
</div></code></pre>
<h3 id="getting-bun-source-code"><a href="#getting-bun-source-code">Getting Bun source code</a></h3>
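<p>We need the Bun source code because the repository contains the bun-lambda package (packages/bun-lambda), which is used to build the Lambda layer.</p>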
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">git</span> <span style="color: #e6edf3;">clone</span> <span style="color: #e6edf3;">git@github.com:oven-sh/bun.git</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">Cloning</span> <span style="color: #e6edf3;">into</span> <span style="color: #a5d6ff;">&#39;bun&#39;</span><span style="color: #e6edf3;">...</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">cd</span> <span style="color: #e6edf3;">bun/</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">git</span> <span style="color: #e6edf3;">checkout</span> <span style="color: #e6edf3;">bun-v1.0.6</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">Note:</span> <span style="color: #e6edf3;">switching</span> <span style="color: #e6edf3;">to</span> <span style="color: #a5d6ff;">&#39;bun-v1.0.6&#39;</span><span style="color: #e6edf3;">.</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">HEAD</span> <span style="color: #e6edf3;">is</span> <span style="color: #e6edf3;">now</span> <span style="color: #e6edf3;">at</span> <span style="color: #e6edf3;">969da088f</span> <span style="color: #e6edf3;">fix</span><span style="color: #e6edf3;"></span><span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">install</span><span style="color: #e6edf3;">)</span>: <span style="color: #d2a8ff;">re-evaluate</span> <span style="color: #e6edf3;">overrides</span> <span style="color: #e6edf3;">when</span> <span style="color: #e6edf3;">removed</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">bun</span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">969da08</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;"></span> <span style="color: #d2a8ff;">via</span> <span style="color: #e6edf3;">🍞</span> <span style="color: #e6edf3;">v1.0.1</span> <span style="color: #e6edf3;">via</span> <span style="color: #e6edf3;">↯</span> <span style="color: #e6edf3;">v0.10.1</span>
</div></code></pre>
<h3 id="installing-lambda-dependencies"><a href="#installing-lambda-dependencies">Installing Lambda dependencies</a></h3>
<p>There are a few dependencies that need to be installed before we can create a Lambda layer.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">cd</span> <span style="color: #e6edf3;">packages/bun-lambda</span>
</div><div class="line" data-line="2">
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">bun</span> <span style="color: #e6edf3;">install</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">bun</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">v1.0.6</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">969da088</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="5"> <span style="color: #d2a8ff;">+</span> <span style="color: #e6edf3;">@oclif/plugin-plugins@3.9.1</span>
</div><div class="line" data-line="6"> <span style="color: #d2a8ff;">+</span> <span style="color: #e6edf3;">bun-types@0.7.3</span>
</div><div class="line" data-line="7"> <span style="color: #d2a8ff;">+</span> <span style="color: #e6edf3;">jszip@3.10.1</span>
</div><div class="line" data-line="8"> <span style="color: #d2a8ff;">+</span> <span style="color: #e6edf3;">oclif@3.6.5</span>
</div><div class="line" data-line="9"> <span style="color: #d2a8ff;">+</span> <span style="color: #e6edf3;">prettier@2.8.4</span>
</div><div class="line" data-line="10"> <span style="color: #d2a8ff;">+</span> <span style="color: #e6edf3;">aws4fetch@1.0.17</span>
</div><div class="line" data-line="11">
</div><div class="line" data-line="12"> <span style="color: #79c0ff;">686</span> <span style="color: #e6edf3;">packages</span> <span style="color: #e6edf3;">installed</span> <span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">6.68s</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="13">
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">bun</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">@oclif/plugin-plugins</span>
</div><div class="line" data-line="15"><span style="color: #d2a8ff;">bun</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">v1.0.6</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">969da088</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="16">
</div><div class="line" data-line="17"> <span style="color: #d2a8ff;">installed</span> <span style="color: #e6edf3;">@oclif/plugin-plugins@3.9.1</span>
</div><div class="line" data-line="18">
</div><div class="line" data-line="19"> <span style="color: #79c0ff;">4</span> <span style="color: #e6edf3;">packages</span> <span style="color: #e6edf3;">installed</span> <span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">110.00ms</span><span style="color: #e6edf3;">]</span>
</div></code></pre>
<h3 id="building-and-publishing-the-layer"><a href="#building-and-publishing-the-layer">Building and publishing the layer</a></h3>
<p>With the dependencies installed we can create the Lambda layer for arm64.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">bun</span> <span style="color: #e6edf3;">run</span> <span style="color: #e6edf3;">build-layer</span> <span style="color: #e6edf3;">--</span> \
</div><div class="line" data-line="2">        <span style="color: #e6edf3;">--arch</span> <span style="color: #e6edf3;">aarch64</span>   \
</div><div class="line" data-line="3">        <span style="color: #e6edf3;">--release</span> <span style="color: #e6edf3;">latest</span> \
</div><div class="line" data-line="4">        <span style="color: #e6edf3;">--output</span> <span style="color: #e6edf3;">./bun-lambda-layer.zip</span>
</div><div class="line" data-line="5"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">bun</span> <span style="color: #e6edf3;">scripts/build-layer.ts</span> <span style="color: #e6edf3;">--arch</span> <span style="color: #e6edf3;">aarch64</span> <span style="color: #e6edf3;">--release</span> <span style="color: #e6edf3;">latest</span> <span style="color: #e6edf3;">--output</span> <span style="color: #e6edf3;">./bun-lambda-layer.zip</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">Downloading...</span> <span style="color: #e6edf3;">https://bun.sh/download/latest/linux/aarch64?avx2=true</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">Extracting...</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">Saving...</span> <span style="color: #e6edf3;">./bun-lambda-layer.zip</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">Saved</span>
</div></code></pre>
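<p>The log above shows that the build downloads a Bun release for the requested architecture. The sketch below only rebuilds that URL from the <code>--release</code> and <code>--arch</code> values as they appear in the output. <code>bunDownloadUrl</code> is a hypothetical helper, not the code from scripts/build-layer.ts, and the <code>avx2=true</code> query parameter is copied from the log as is.</p>

```typescript
// Hypothetical helper: rebuilds the download URL printed in the log above.
// The real logic lives in scripts/build-layer.ts and may differ.
function bunDownloadUrl(release: string, arch: string): string {
  const safeRelease = encodeURIComponent(release);
  const safeArch = encodeURIComponent(arch);
  return `https://bun.sh/download/${safeRelease}/linux/${safeArch}?avx2=true`;
}

console.log(bunDownloadUrl("latest", "aarch64"));
// prints "https://bun.sh/download/latest/linux/aarch64?avx2=true"
```

<p>Note that <code>--release latest</code> resolves to whatever version is current at build time, so pinning a concrete release is probably safer for reproducible layers.</p>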
<p>Publishing the layer takes a single command.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">bun</span> <span style="color: #e6edf3;">run</span> <span style="color: #e6edf3;">publish-layer</span> <span style="color: #e6edf3;">--</span>         \
</div><div class="line" data-line="2">  <span style="color: #e6edf3;">--layer</span> <span style="color: #e6edf3;">bun-lambda-layer</span>         \
</div><div class="line" data-line="3">  <span style="color: #e6edf3;">--arch</span> <span style="color: #e6edf3;">aarch64</span>                   \
</div><div class="line" data-line="4">  <span style="color: #e6edf3;">--release</span> <span style="color: #e6edf3;">latest</span>                 \
</div><div class="line" data-line="5">  <span style="color: #e6edf3;">--output</span> <span style="color: #e6edf3;">./bun-lambda-layer.zip</span>  \
</div><div class="line" data-line="6">  <span style="color: #e6edf3;">--region</span> <span style="color: #e6edf3;">eu-west-1</span>
</div><div class="line" data-line="7">
</div><div class="line" data-line="8"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">bun</span> <span style="color: #e6edf3;">scripts/publish-layer.ts</span> <span style="color: #e6edf3;">--layer</span> <span style="color: #e6edf3;">bun-lambda-layer</span> <span style="color: #e6edf3;">--arch</span> <span style="color: #e6edf3;">aarch64</span> <span style="color: #e6edf3;">--release</span> <span style="color: #e6edf3;">latest</span> <span style="color: #e6edf3;">--output</span> <span style="color: #e6edf3;">./bun-lambda-layer.zip</span> <span style="color: #e6edf3;">--region</span> <span style="color: #e6edf3;">eu-west-1</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">Downloading...</span> <span style="color: #e6edf3;">https://bun.sh/download/latest/linux/aarch64?avx2=true</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">Extracting...</span>
</div><div class="line" data-line="11"><span style="color: #d2a8ff;">Saving...</span> <span style="color: #e6edf3;">./bun-lambda-layer.zip</span>
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">Saved</span>
</div><div class="line" data-line="13"><span style="color: #d2a8ff;">Publishing...</span>
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">Published</span> <span style="color: #e6edf3;">arn:aws:lambda:eu-west-1:651831719661:layer:bun-lambda-layer:1</span>
</div><div class="line" data-line="15"><span style="color: #d2a8ff;">Done</span>
</div></code></pre>
<p>Checking the layer:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">aws</span> <span style="color: #e6edf3;">lambda</span> <span style="color: #e6edf3;">list-layers</span> \
</div><div class="line" data-line="2"><span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">jq</span> <span style="color: #a5d6ff;">&#39;.Layers[].LayerArn&#39;</span> \
</div><div class="line" data-line="3"><span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">xargs</span> <span style="color: #e6edf3;">-I</span> <span style="color: #e6edf3;">&lbrace;</span><span style="color: #e6edf3;">&rbrace;</span> <span style="color: #e6edf3;">bash</span> <span style="color: #e6edf3;">-c</span> <span style="color: #a5d6ff;">&quot;aws lambda list-layer-versions --layer-name &lbrace;&rbrace; | jq &#39;.LayerVersions[].LayerVersionArn&#39;&quot;</span> \
</div><div class="line" data-line="4"><span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">egrep</span> <span style="color: #e6edf3;">bun</span>
</div><div class="line" data-line="5"><span style="color: #a5d6ff;">&quot;arn:aws:lambda:eu-west-1:651831719661:layer:bun-lambda-layer:1&quot;</span>
</div></code></pre>
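<p>The layer version ARN returned here packs several pieces of information into one string: region, account, layer name and version. As a rough illustration, it can be split apart like this. <code>parseLayerVersionArn</code> and the <code>LayerVersionArn</code> type are names invented for this sketch and are not part of any AWS SDK.</p>

```typescript
interface LayerVersionArn {
  region: string;
  accountId: string;
  layerName: string;
  version: number;
}

// Splits a Lambda layer version ARN of the form
// arn:<partition>:lambda:<region>:<account-id>:layer:<name>:<version>
function parseLayerVersionArn(arn: string): LayerVersionArn {
  const parts = arn.split(":");
  if (parts.length !== 8 || parts[0] !== "arn" || parts[2] !== "lambda" || parts[5] !== "layer") {
    throw new Error(`Not a Lambda layer version ARN: ${arn}`);
  }
  const version = Number(parts[7]);
  if (!Number.isInteger(version) || version < 1) {
    throw new Error(`Invalid layer version in ARN: ${arn}`);
  }
  return { region: parts[3], accountId: parts[4], layerName: parts[6], version };
}

const layer = parseLayerVersionArn(
  "arn:aws:lambda:eu-west-1:651831719661:layer:bun-lambda-layer:1"
);
console.log(layer.region, layer.layerName, layer.version);
```

<p>Every publish creates a new version, so the trailing number is the part that changes when the layer is rebuilt.</p>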
<h2 id="creating-lambda-execution-role"><a href="#creating-lambda-execution-role">Creating Lambda execution role</a></h2>
<p>First we need a file, role.policy.json, containing the assume-role (trust) policy document. It allows the Lambda service to assume the role.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-json" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="2">  <span style="color: #79c0ff;">&quot;Version&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;2012-10-17&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="3">  <span style="color: #79c0ff;">&quot;Statement&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span>
</div><div class="line" data-line="4">    <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="5">      <span style="color: #79c0ff;">&quot;Effect&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;Allow&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="6">      <span style="color: #79c0ff;">&quot;Principal&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="7">        <span style="color: #79c0ff;">&quot;Service&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;lambda.amazonaws.com&quot;</span>
</div><div class="line" data-line="8">      <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="9">      <span style="color: #79c0ff;">&quot;Action&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;sts:AssumeRole&quot;</span>
</div><div class="line" data-line="10">    <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="11">  <span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="12"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
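<p>If the policy file is generated rather than written by hand, the same document can be built from a typed object. This is a minimal sketch, assuming only the structure shown above. <code>trustPolicyFor</code> and the <code>TrustPolicy</code> type are hypothetical names.</p>

```typescript
// Shape of the assume-role policy document shown above.
interface TrustPolicy {
  Version: "2012-10-17";
  Statement: Array<{
    Effect: "Allow";
    Principal: { Service: string };
    Action: "sts:AssumeRole";
  }>;
}

// Builds a trust policy that lets the given AWS service assume the role.
function trustPolicyFor(servicePrincipal: string): TrustPolicy {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: { Service: servicePrincipal },
        Action: "sts:AssumeRole",
      },
    ],
  };
}

// The serialized text is what would be saved as role.policy.json.
const policyJson = JSON.stringify(trustPolicyFor("lambda.amazonaws.com"), null, 2);
console.log(policyJson);
```

<p>The trust policy only controls who may assume the role. Permissions for the function itself, such as writing logs, are attached to the role separately.</p>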
<p>Now we can create a role called bun-lambda-role that references this policy document.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">aws</span> <span style="color: #e6edf3;">iam</span> <span style="color: #e6edf3;">create-role</span> <span style="color: #e6edf3;">--role-name</span> <span style="color: #e6edf3;">bun-lambda-role</span> <span style="color: #e6edf3;">--assume-role-policy-document</span> <span style="color: #e6edf3;">file://role.policy.json</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="3">    <span style="color: #a5d6ff;">&quot;Role&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="4">        <span style="color: #a5d6ff;">&quot;Path&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;/&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="5">        <span style="color: #a5d6ff;">&quot;RoleName&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;bun-lambda-role&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="6">        <span style="color: #a5d6ff;">&quot;RoleId&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;AROAZPRBSW3W7DL5YG6HZ&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="7">        <span style="color: #a5d6ff;">&quot;Arn&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;arn:aws:iam::651831719661:role/bun-lambda-role&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="8">        <span style="color: #a5d6ff;">&quot;CreateDate&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;2023-10-19T10:16:11+00:00&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="9">        <span style="color: #a5d6ff;">&quot;AssumeRolePolicyDocument&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="10">            <span style="color: #a5d6ff;">&quot;Version&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;2012-10-17&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="11">            <span style="color: #a5d6ff;">&quot;Statement&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #e6edf3;">[</span>
</div><div class="line" data-line="12">                <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="13">                    <span style="color: #a5d6ff;">&quot;Effect&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;Allow&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="14">                    <span style="color: #a5d6ff;">&quot;Principal&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="15">                        <span style="color: #a5d6ff;">&quot;Service&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;lambda.amazonaws.com&quot;</span>
</div><div class="line" data-line="16">                    <span style="color: #e6edf3;">&rbrace;</span>,
</div><div class="line" data-line="17">                    <span style="color: #a5d6ff;">&quot;Action&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;sts:AssumeRole&quot;</span>
</div><div class="line" data-line="18">                <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="19">            <span style="color: #d2a8ff;">]</span>
</div><div class="line" data-line="20">        <span style="color: #d2a8ff;">&rbrace;</span>
</div><div class="line" data-line="21">    <span style="color: #d2a8ff;">&rbrace;</span>
</div><div class="line" data-line="22"><span style="color: #d2a8ff;">&rbrace;</span>
</div></code></pre>
<h3 id="creating-the-lambda-function"><a href="#creating-the-lambda-function">Creating the Lambda function</a></h3>
<p>Next we write the Lambda function that handles the requests. The handler receives a standard Request object and returns a Response.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-typescript" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">export</span> <span style="color: #ff7b72;">default</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="2">  <span style="color: #ff7b72;">async</span> <span style="color: #d2a8ff;">handler</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">request</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Request</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Promise</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Response</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="3">    <span style="color: #79c0ff;">console</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">log</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">request</span><span style="color: #e6edf3;">.</span><span style="color: #79c0ff;">headers</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">get</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;x-amzn-function-arn&quot;</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="4">    <span style="color: #ff7b72;">return</span> <span style="color: #ff7b72;">new</span> <span style="color: #ffa657;">Response</span><span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">JSON</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">stringify</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">&lbrace;</span><span style="color: #a5d6ff;">&quot;Hello&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;from Bun runtime&quot;</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="5">      <span style="color: #79c0ff;">status</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">200</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="6">      <span style="color: #79c0ff;">headers</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="7">        <span style="color: #a5d6ff;">&quot;Content-Type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;application/json&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="8">      <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="9">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="10">  <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="11"><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">;</span>
</div></code></pre>
<p>Save this as index.ts and create a zip file:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">zip</span> <span style="color: #e6edf3;">bun-function.zip</span> <span style="color: #e6edf3;">index.ts</span>
</div></code></pre>
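<p>Before uploading, it can be worth calling the handler locally. The sketch below repeats the handler so that it is self-contained and invokes it with a hand-built Request. The names app and invokeLocally and the example.invalid URL are made up for illustration, and it assumes a runtime with global Request and Response objects (Bun has them, as does Node.js 18 or newer).</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-typescript" translate="no" tabindex="0"><div class="line" data-line="1">// Same handler body as in the post, held in a local object so the sketch is self-contained.
</div><div class="line" data-line="2">const app = &lbrace;
</div><div class="line" data-line="3">  async handler(request: Request) &lbrace;
</div><div class="line" data-line="4">    console.log(request.headers.get(&quot;x-amzn-function-arn&quot;));
</div><div class="line" data-line="5">    return new Response(JSON.stringify(&lbrace; Hello: &quot;from Bun runtime&quot; &rbrace;), &lbrace;
</div><div class="line" data-line="6">      status: 200,
</div><div class="line" data-line="7">      headers: &lbrace; &quot;Content-Type&quot;: &quot;application/json&quot; &rbrace;,
</div><div class="line" data-line="8">    &rbrace;);
</div><div class="line" data-line="9">  &rbrace;,
</div><div class="line" data-line="10">&rbrace;;
</div><div class="line" data-line="11">
</div><div class="line" data-line="12">// Hypothetical helper: call the handler directly with a hand-built Request, no Lambda involved.
</div><div class="line" data-line="13">async function invokeLocally(url: string) &lbrace;
</div><div class="line" data-line="14">  const response = await app.handler(new Request(url));
</div><div class="line" data-line="15">  return &lbrace; status: response.status, body: await response.json() &rbrace;;
</div><div class="line" data-line="16">&rbrace;
</div><div class="line" data-line="17">
</div><div class="line" data-line="18">// example.invalid is a placeholder URL; the handler never contacts it.
</div><div class="line" data-line="19">invokeLocally(&quot;https://example.invalid/&quot;).then(function (result) &lbrace;
</div><div class="line" data-line="20">  console.log(result.status, JSON.stringify(result.body));
</div><div class="line" data-line="21">&rbrace;);
</div></code></pre>
<p>Calling the handler this way only checks the request and response logic. The Lambda layer, the IAM role and the function URL are not involved.</p>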
<p>Finally, we can create the function:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">aws</span> <span style="color: #e6edf3;">lambda</span> <span style="color: #e6edf3;">create-function</span> \
</div><div class="line" data-line="2">          <span style="color: #e6edf3;">--function-name</span> <span style="color: #e6edf3;">bun-lambda-function</span> \
</div><div class="line" data-line="3">          <span style="color: #e6edf3;">--runtime</span> <span style="color: #e6edf3;">provided.al2</span> \
</div><div class="line" data-line="4">          <span style="color: #e6edf3;">--architectures</span> <span style="color: #e6edf3;">arm64</span> \
</div><div class="line" data-line="5">          <span style="color: #e6edf3;">--layers</span> <span style="color: #e6edf3;">arn:aws:lambda:eu-west-1:651831719661:layer:bun-layer:1</span> \
</div><div class="line" data-line="6">          <span style="color: #e6edf3;">--zip-file</span> <span style="color: #e6edf3;">fileb://bun-function.zip</span> \
</div><div class="line" data-line="7">          <span style="color: #e6edf3;">--handler</span> <span style="color: #e6edf3;">index.handler</span> \
</div><div class="line" data-line="8">          <span style="color: #e6edf3;">--role</span> <span style="color: #e6edf3;">arn:aws:iam::651831719661:role/bun-lambda-role</span>
</div></code></pre>
<p>The response from the AWS API has all the details of the Lambda function we just created:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-json" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="2">    <span style="color: #79c0ff;">&quot;FunctionName&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;bun-lambda-function&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="3">    <span style="color: #79c0ff;">&quot;FunctionArn&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;arn:aws:lambda:eu-west-1:651831719661:function:bun-lambda-function&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="4">    <span style="color: #79c0ff;">&quot;Runtime&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;provided.al2&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="5">    <span style="color: #79c0ff;">&quot;Role&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;arn:aws:iam::651831719661:role/bun-lambda-role&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="6">    <span style="color: #79c0ff;">&quot;Handler&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;index.handler&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="7">    <span style="color: #79c0ff;">&quot;CodeSize&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">365</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="8">    <span style="color: #79c0ff;">&quot;Description&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="9">    <span style="color: #79c0ff;">&quot;Timeout&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">3</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="10">    <span style="color: #79c0ff;">&quot;MemorySize&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">128</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="11">    <span style="color: #79c0ff;">&quot;LastModified&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;2023-10-19T10:16:58.106+0000&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="12">    <span style="color: #79c0ff;">&quot;CodeSha256&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;yNUNBaGkJuist/ZoQkZtnToNNXIU/taYQHqZrwCu3Mk=&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="13">    <span style="color: #79c0ff;">&quot;Version&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;$LATEST&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="14">    <span style="color: #79c0ff;">&quot;TracingConfig&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="15">        <span style="color: #79c0ff;">&quot;Mode&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;PassThrough&quot;</span>
</div><div class="line" data-line="16">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="17">    <span style="color: #79c0ff;">&quot;RevisionId&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;c8edb420-aeeb-4568-8289-ea7b0b845d00&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="18">    <span style="color: #79c0ff;">&quot;Layers&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span>
</div><div class="line" data-line="19">        <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="20">            <span style="color: #79c0ff;">&quot;Arn&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;arn:aws:lambda:eu-west-1:651831719661:layer:bun-layer:1&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="21">            <span style="color: #79c0ff;">&quot;CodeSize&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">33260946</span>
</div><div class="line" data-line="22">        <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="23">    <span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="24">    <span style="color: #79c0ff;">&quot;State&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;Pending&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="25">    <span style="color: #79c0ff;">&quot;StateReason&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The function is being created.&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="26">    <span style="color: #79c0ff;">&quot;StateReasonCode&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;Creating&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="27">    <span style="color: #79c0ff;">&quot;PackageType&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;Zip&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="28">    <span style="color: #79c0ff;">&quot;Architectures&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span>
</div><div class="line" data-line="29">        <span style="color: #a5d6ff;">&quot;arm64&quot;</span>
</div><div class="line" data-line="30">    <span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="31">    <span style="color: #79c0ff;">&quot;EphemeralStorage&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="32">        <span style="color: #79c0ff;">&quot;Size&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">512</span>
</div><div class="line" data-line="33">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="34">    <span style="color: #79c0ff;">&quot;SnapStart&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="35">        <span style="color: #79c0ff;">&quot;ApplyOn&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;None&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="36">        <span style="color: #79c0ff;">&quot;OptimizationStatus&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;Off&quot;</span>
</div><div class="line" data-line="37">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="38">    <span style="color: #79c0ff;">&quot;RuntimeVersionConfig&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="39">        <span style="color: #79c0ff;">&quot;RuntimeVersionArn&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;arn:aws:lambda:eu-west-1::runtime:dce29199fb5887a2c4fceaa2f34d395ba43a74a6895b381cb9383b1c7f3b5875&quot;</span>
</div><div class="line" data-line="40">    <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="41"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
<p>To make the Lambda function accessible without any additional services, we can enable a function URL:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">aws</span> <span style="color: #e6edf3;">lambda</span> <span style="color: #e6edf3;">create-function-url-config</span> \
</div><div class="line" data-line="2">  <span style="color: #e6edf3;">--function-name</span> <span style="color: #e6edf3;">bun-lambda-function</span> \
</div><div class="line" data-line="3">  <span style="color: #e6edf3;">--auth-type</span> <span style="color: #e6edf3;">NONE</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="5">    <span style="color: #a5d6ff;">&quot;FunctionUrl&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;https://wbjmhbkwx5ikumwxkhka4lcgle0nvjgs.lambda-url.eu-west-1.on.aws/&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="6">    <span style="color: #a5d6ff;">&quot;FunctionArn&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;arn:aws:lambda:eu-west-1:651831719661:function:bun-lambda-function&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="7">    <span style="color: #a5d6ff;">&quot;AuthType&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;NONE&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="8">    <span style="color: #a5d6ff;">&quot;CreationTime&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;2023-10-19T10:18:56.838382Z&quot;</span>
</div><div class="line" data-line="9"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
<h2 id="testing-the-newly-created-function"><a href="#testing-the-newly-created-function">Testing the newly created function</a></h2>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">curl</span> <span style="color: #e6edf3;">-vIX</span> <span style="color: #e6edf3;">GET</span> <span style="color: #e6edf3;">https://wbjmhbkwx5ikumwxkhka4lcgle0nvjgs.lambda-url.eu-west-1.on.aws/</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">&lbrace;</span><span style="color: #a5d6ff;">&quot;Message&quot;</span><span style="color: #a5d6ff;">:</span><span style="color: #a5d6ff;">&quot;Forbidden&quot;</span><span style="color: #a5d6ff;">&rbrace;</span><span style="color: #a5d6ff;">⏎</span>
</div><div class="line" data-line="3"><span style="color: #e6edf3;"></span>
</div></code></pre>
<p>It seems something is missing. Digging through the possible reasons, I found the following:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">Your function URL auth type is NONE, but is missing permissions required 
</div><div class="line" data-line="2">for public access. To allow unauthenticated requests, choose the Permissions 
</div><div class="line" data-line="3">tab and create a resource-based policy that grants lambda:invokeFunctionUrl 
</div><div class="line" data-line="4">permissions to all principals (*). Alternatively, you can update your function 
</div><div class="line" data-line="5">URL auth type to AWS_IAM to use IAM authentication.
</div></code></pre>
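<p>Let's create the missing permission, either on the Permissions tab of the Lambda console as shown here, or with aws lambda add-permission using --function-url-auth-type NONE:</p>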
<p><img src="/static/img/blog/function_url_permission.webp" alt="Function URL permission" /></p>
<p>The JSON version of the policy:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-json" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="2">  <span style="color: #79c0ff;">&quot;Version&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;2012-10-17&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="3">  <span style="color: #79c0ff;">&quot;Id&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;default&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="4">  <span style="color: #79c0ff;">&quot;Statement&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span>
</div><div class="line" data-line="5">    <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="6">      <span style="color: #79c0ff;">&quot;Sid&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;FunctionURLAllowPublicAccess&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="7">      <span style="color: #79c0ff;">&quot;Effect&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;Allow&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="8">      <span style="color: #79c0ff;">&quot;Principal&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;*&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="9">      <span style="color: #79c0ff;">&quot;Action&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;lambda:InvokeFunctionUrl&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="10">      <span style="color: #79c0ff;">&quot;Resource&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;arn:aws:lambda:eu-west-1:651831719661:function:bun-lambda-function&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="11">      <span style="color: #79c0ff;">&quot;Condition&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="12">        <span style="color: #79c0ff;">&quot;StringEquals&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="13">          <span style="color: #79c0ff;">&quot;lambda:FunctionUrlAuthType&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;NONE&quot;</span>
</div><div class="line" data-line="14">        <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="15">      <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="16">    <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="17">  <span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="18"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
<p>Finally, we can invoke the function:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">curl</span> <span style="color: #e6edf3;">-iX</span> <span style="color: #e6edf3;">GET</span> <span style="color: #e6edf3;">https://wbjmhbkwx5ikumwxkhka4lcgle0nvjgs.lambda-url.eu-west-1.on.aws/</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">HTTP/1.1</span> <span style="color: #79c0ff;">200</span> <span style="color: #e6edf3;">OK</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">Date:</span> <span style="color: #e6edf3;">Thu,</span> <span style="color: #79c0ff;">19</span> <span style="color: #e6edf3;">Oct</span> <span style="color: #79c0ff;">2023</span> <span style="color: #e6edf3;">12:01:29</span> <span style="color: #e6edf3;">GMT</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">Content-Type:</span> <span style="color: #e6edf3;">application/json</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">Content-Length:</span> <span style="color: #79c0ff;">28</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">Connection:</span> <span style="color: #e6edf3;">keep-alive</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">x-amzn-RequestId:</span> <span style="color: #e6edf3;">cd204d16-01c9-4468-a02e-9f6d7d3ee2e1</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">X-Amzn-Trace-Id:</span> <span style="color: #e6edf3;">root=1-65311a98-3ead96f16ebd86bc6366d05b</span><span style="color: #e6edf3;">;</span><span style="color: #e6edf3;">sampled</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">0</span><span style="color: #e6edf3;">;</span><span style="color: #e6edf3;">lineage</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">fc7d7573:0</span>
</div><div class="line" data-line="9">
</div><div class="line" data-line="10"><span style="color: #e6edf3;">&lbrace;</span><span style="color: #a5d6ff;">&quot;Hello&quot;</span><span style="color: #a5d6ff;">:</span><span style="color: #a5d6ff;">&quot;from Bun runtime&quot;</span><span style="color: #a5d6ff;">&rbrace;</span>
</div><div class="line" data-line="11"><span style="color: #e6edf3;"></span>
</div></code></pre>
<p>Latency looks reasonable:</p>
<p><img src="/static/img/blog/bun_latency_1.webp" alt="Bun latency" /></p>
<p><img src="/static/img/blog/bun_latency_2.webp" alt="Bun latency" /></p>
<p>That’s all folks!</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Thu, 19 Oct 2023 09:11:13 +0100</pubDate>
    </item>
    <item>
      <title>How fast DuckDB can query Parquet files?</title>
      <link>https://dev.l1x.be/posts/2023/03/26/how-fast-duckdb-can-query-parquet-files/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2023/03/26/how-fast-duckdb-can-query-parquet-files/</guid>
      <content:encoded><![CDATA[<p><img src="/static/img/og/duckdb.webp" alt="DuckDB" /></p>
<h2 id="how-fast-duckdb-can-query-parquet-files"><a href="#how-fast-duckdb-can-query-parquet-files">How fast DuckDB can query Parquet files?</a></h2>
<p>I found myself in a situation where I had to check a few million lines of web logs stored in Parquet files, and I had already seen the rise of DuckDB in a few articles.</p>
<p>DuckDB is a high-performance, embedded analytical SQL database. It is designed to support efficient and fast querying of large datasets, including those stored in various file formats like Parquet. DuckDB can read Parquet files natively, without the need for conversion.</p>
<p>It is time to try it out!</p>
<h3 id="getting-started"><a href="#getting-started">Getting started</a></h3>
<p>Installing DuckDB works as expected; there is an installer for most platforms:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">brew</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">duckdb</span>
</div></code></pre>
<p>Version:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">duckdb</span> <span style="color: #e6edf3;">--version</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">v0.7.1</span> <span style="color: #e6edf3;">b00b93f0b1</span>
</div></code></pre>
<h3 id="querying-a-bunch-of-files"><a href="#querying-a-bunch-of-files">Querying a bunch of files</a></h3>
<p>The user interface is pretty similar to SQLite's, because the DuckDB command line client is based on the SQLite shell.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">duckdb</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">v0.7.1</span> <span style="color: #e6edf3;">b00b93f0b1</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">Enter</span> <span style="color: #a5d6ff;">&quot;.help&quot;</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">usage</span> <span style="color: #e6edf3;">hints.</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">Connected</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">transient</span> <span style="color: #e6edf3;">in-memory</span> <span style="color: #e6edf3;">database.</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">Use</span> <span style="color: #a5d6ff;">&quot;.open FILENAME&quot;</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">reopen</span> <span style="color: #e6edf3;">on</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">persistent</span> <span style="color: #e6edf3;">database.</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">D</span>
</div></code></pre>
<p>The files are located in a folder. We have a file for each day.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">exa</span> <span style="color: #e6edf3;">-ah</span> <span style="color: #e6edf3;">*.parq</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">2023-03-01.gz.parq</span>  <span style="color: #e6edf3;">2023-03-13.gz.parq</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">2023-03-02.gz.parq</span>  <span style="color: #e6edf3;">2023-03-14.gz.parq</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">2023-03-03.gz.parq</span>  <span style="color: #e6edf3;">2023-03-15.gz.parq</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">2023-03-04.gz.parq</span>  <span style="color: #e6edf3;">2023-03-16.gz.parq</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">2023-03-05.gz.parq</span>  <span style="color: #e6edf3;">2023-03-17.gz.parq</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">2023-03-06.gz.parq</span>  <span style="color: #e6edf3;">2023-03-18.gz.parq</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">2023-03-07.gz.parq</span>  <span style="color: #e6edf3;">2023-03-19.gz.parq</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">2023-03-08.gz.parq</span>  <span style="color: #e6edf3;">2023-03-20.gz.parq</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">2023-03-09.gz.parq</span>  <span style="color: #e6edf3;">2023-03-21.gz.parq</span>
</div><div class="line" data-line="11"><span style="color: #d2a8ff;">2023-03-10.gz.parq</span>  <span style="color: #e6edf3;">2023-03-22.gz.parq</span>
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">2023-03-11.gz.parq</span>  <span style="color: #e6edf3;">2023-03-23.gz.parq</span>
</div><div class="line" data-line="13"><span style="color: #d2a8ff;">2023-03-12.gz.parq</span>  <span style="color: #e6edf3;">2023-03-24.gz.parq</span>
</div></code></pre>
<p>Let's just get the name and type of the fields:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">D DESCRIBE SELECT * FROM read_parquet(&#39;*.gz.parq&#39;);
</div><div class="line" data-line="2">┌─────────────────────────────┬─────────────┬─────────┬─────────┬─────────┬─────────┐
</div><div class="line" data-line="3">│         column_name         │ column_type │  null   │   key   │ default │  extra  │
</div><div class="line" data-line="4">│           varchar           │   varchar   │ varchar │ varchar │ varchar │ varchar │
</div><div class="line" data-line="5">├─────────────────────────────┼─────────────┼─────────┼─────────┼─────────┼─────────┤
</div><div class="line" data-line="6">│ iso-datetime                │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="7">│ x-edge-location             │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="8">│ sc-bytes                    │ BIGINT      │ YES     │         │         │         │
</div><div class="line" data-line="9">│ c-ip                        │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="10">│ cs-method                   │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="11">│ cs-host                     │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="12">│ cs-uri-stem                 │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="13">│ sc-status                   │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="14">│ cs-referer                  │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="15">│ cs-user-agent               │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="16">│ cs-uri-query                │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="17">│ cs-cookie                   │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="18">│ x-edge-result-type          │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="19">│ x-edge-request-id           │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="20">│ x-host-header               │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="21">│ cs-protocol                 │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="22">│ cs-bytes                    │ BIGINT      │ YES     │         │         │         │
</div><div class="line" data-line="23">│ time-taken                  │ DOUBLE      │ YES     │         │         │         │
</div><div class="line" data-line="24">│ x-forwarded-for             │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="25">│ ssl-protocol                │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="26">│ ssl-cipher                  │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="27">│ x-edge-response-result-type │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="28">│ cs-protocol-version         │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="29">│ fle-status                  │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="30">│ fle-encrypted-fields        │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="31">│ c-port                      │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="32">│ time-to-first-byte          │ DOUBLE      │ YES     │         │         │         │
</div><div class="line" data-line="33">│ x-edge-detailed-result-type │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="34">│ sc-content-type             │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="35">│ sc-content-len              │ BIGINT      │ YES     │         │         │         │
</div><div class="line" data-line="36">│ sc-range-start              │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="37">│ sc-range-end                │ VARCHAR     │ YES     │         │         │         │
</div><div class="line" data-line="38">├─────────────────────────────┴─────────────┴─────────┴─────────┴─────────┴─────────┤
</div><div class="line" data-line="39">│ 32 rows                                                                 6 columns │
</div><div class="line" data-line="40">└───────────────────────────────────────────────────────────────────────────────────┘
</div><div class="line" data-line="41">D
</div></code></pre>
<p>I would like to generate reader statistics for the URLs that start with /posts/. DuckDB has great support for working with strings beyond the basic SQL features. One way of filtering results is to use the GLOB operator.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">D SELECT &quot;cs-uri-stem&quot; FROM read_parquet(&#39;*.gz.parq&#39;)
</div><div class="line" data-line="2">WHERE &quot;cs-uri-stem&quot; GLOB &#39;/posts/*&#39; LIMIT 10;
</div><div class="line" data-line="3">┌──────────────────────────────────────────────────────────────────┐
</div><div class="line" data-line="4">│                           cs-uri-stem                            │
</div><div class="line" data-line="5">│                             varchar                              │
</div><div class="line" data-line="6">├──────────────────────────────────────────────────────────────────┤
</div><div class="line" data-line="7">│ /posts/2021/03/22/using-ldap-in-docker-with-caching/             │
</div><div class="line" data-line="8">│ /posts/2021/03/22/using-ldap-in-docker-with-caching              │
</div><div class="line" data-line="9">│ /posts/2020/05/08/why-i-chose-fsharp-for-our-aws-lambda-project/ │
</div><div class="line" data-line="10">│ /posts/2023/02/28/using-python-3.11-with-aws-lambda/             │
</div><div class="line" data-line="11">│ /posts/2023/02/28/using-python-3.11-with-aws-lambda              │
</div><div class="line" data-line="12">│ /posts/2023/02/28/using-python-3.11-with-aws-lambda/             │
</div><div class="line" data-line="13">│ /posts/2023/02/28/using-python-3.11-with-aws-lambda/             │
</div><div class="line" data-line="14">│ /posts/2023/02/28/using-python-3.11-with-aws-lambda/             │
</div><div class="line" data-line="15">│ /posts/2023/02/28/using-python-3.11-with-aws-lambda/             │
</div><div class="line" data-line="16">│ /posts/2023/02/28/using-python-3.11-with-aws-lambda/             │
</div><div class="line" data-line="17">├──────────────────────────────────────────────────────────────────┤
</div><div class="line" data-line="18">│                             10 rows                              │
</div><div class="line" data-line="19">└──────────────────────────────────────────────────────────────────┘
</div></code></pre>
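<p>To make the pattern semantics concrete, here is a minimal Python sketch that approximates the same filter with the standard library. It is not what DuckDB runs internally. The first two paths are copied from the output above; the third one is a hypothetical non-matching path added for contrast.</p>

```python
from fnmatch import fnmatchcase

# The first two values are copied from the cs-uri-stem column shown above.
# The last one is a hypothetical request path that should not match.
uris = [
    "/posts/2021/03/22/using-ldap-in-docker-with-caching/",
    "/posts/2023/02/28/using-python-3.11-with-aws-lambda",
    "/static/img/og/bear.webp",
]


def matches_posts(uri):
    # fnmatchcase is case-sensitive and lets * match any run of characters,
    # including "/", which is how the pattern is used in the query above.
    return fnmatchcase(uri, "/posts/*")


posts_only = [u for u in uris if matches_posts(u)]
print(posts_only)
```

<p>The pattern only anchors the prefix, so every path under /posts/ passes, with or without the trailing slash.</p>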
<p>I would like the statistics to ignore the trailing / and to group only by the last segment of the URL:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">SELECT COUNT(1) AS cnt,
</div><div class="line" data-line="2">split_part(rtrim(&quot;cs-uri-stem&quot;, &#39;/&#39;), &#39;/&#39;, 6) AS url
</div><div class="line" data-line="3">FROM read_parquet(&#39;*.gz.parq&#39;)
</div><div class="line" data-line="4">WHERE &quot;cs-uri-stem&quot; GLOB &#39;/posts/*&#39;
</div><div class="line" data-line="5">GROUP BY url
</div><div class="line" data-line="6">HAVING url &lt;&gt; &#39;&#39;
</div><div class="line" data-line="7">ORDER BY cnt DESC LIMIT 15;
</div><div class="line" data-line="8">
</div><div class="line" data-line="9">┌────────┬──────────────────────────────────────────────────────────┐
</div><div class="line" data-line="10">│  cnt   │                           url                            │
</div><div class="line" data-line="11">│ int64  │                         varchar                          │
</div><div class="line" data-line="12">├────────┼──────────────────────────────────────────────────────────┤
</div><div class="line" data-line="13">│ 109023 │ using-llama-with-m1-mac                                  │
</div><div class="line" data-line="14">│   4372 │ using-python-3.11-with-aws-lambda                        │
</div><div class="line" data-line="15">│    711 │ misusing-ninja                                           │
</div><div class="line" data-line="16">│    287 │ beyond-the-borrow-checker                                │
</div><div class="line" data-line="17">│    142 │ getting-started-with-firecracker-on-raspberry-pi         │
</div><div class="line" data-line="18">│    135 │ new-tools                                                │
</div><div class="line" data-line="19">│     93 │ compressing-data-with-parquet                            │
</div><div class="line" data-line="20">│     90 │ running-asp.net-web-application-with-falco-on-aws-lambda │
</div><div class="line" data-line="21">│     89 │ using-ldap-in-docker-with-caching                        │
</div><div class="line" data-line="22">│     83 │ diving-into-firecracker-with-alpine                      │
</div><div class="line" data-line="23">│     80 │ compressing-aws-s3-logs-after-getting-hackernewsed       │
</div><div class="line" data-line="24">│     30 │ why-i-chose-fsharp-for-our-aws-lambda-project            │
</div><div class="line" data-line="25">│     25 │ how-long-will-the-worlds-uranium-supplies-last           │
</div><div class="line" data-line="26">│     23 │ matching-binary-patterns                                 │
</div><div class="line" data-line="27">│     18 │ freenas-11.3-upgrade-issues                              │
</div><div class="line" data-line="28">├────────┴──────────────────────────────────────────────────────────┤
</div><div class="line" data-line="29">│ 15 rows                                                 2 columns │
</div><div class="line" data-line="30">└───────────────────────────────────────────────────────────────────┘
</div></code></pre>
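<p>Why field number 6? The path starts with a slash, so the first field after splitting is empty, followed by posts, year, month, day and finally the slug. A small Python sketch of the same logic, assuming split_part uses 1-based indexes and returns an empty string when the index is out of range (the function name last_segment is made up for this illustration):</p>

```python
from collections import Counter


def last_segment(uri_stem, index=6):
    # Mirrors split_part(rtrim(uri, '/'), '/', 6): drop trailing slashes,
    # split on "/", then pick the 1-based field.
    parts = uri_stem.rstrip("/").split("/")
    try:
        return parts[index - 1]
    except IndexError:
        # Paths with fewer fields yield an empty string.
        return ""


# Sample paths taken from the earlier query output, plus the bare prefix.
uris = [
    "/posts/2023/02/28/using-python-3.11-with-aws-lambda/",
    "/posts/2023/02/28/using-python-3.11-with-aws-lambda",
    "/posts/2021/03/22/using-ldap-in-docker-with-caching/",
    "/posts/",
]

# Empty segments are dropped, like the HAVING clause does in the query.
counts = Counter(seg for seg in map(last_segment, uris) if seg)
for url, cnt in counts.most_common(15):
    print(cnt, url)
```

<p>Both spellings of the same post end up in one bucket, and the bare /posts/ prefix is filtered out because its sixth field is empty.</p>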
<p>I was wondering what DuckDB does under the hood:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">EXPLAIN ...
</div><div class="line" data-line="2">┌─────────────────────────────┐
</div><div class="line" data-line="3">│┌───────────────────────────┐│
</div><div class="line" data-line="4">││       Physical Plan       ││
</div><div class="line" data-line="5">│└───────────────────────────┘│
</div><div class="line" data-line="6">└─────────────────────────────┘
</div><div class="line" data-line="7">┌───────────────────────────┐
</div><div class="line" data-line="8">│           TOP_N           │
</div><div class="line" data-line="9">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="10">│           Top 15          │
</div><div class="line" data-line="11">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="12">│       count(1) DESC       │
</div><div class="line" data-line="13">└─────────────┬─────────────┘
</div><div class="line" data-line="14">┌─────────────┴─────────────┐
</div><div class="line" data-line="15">│         PROJECTION        │
</div><div class="line" data-line="16">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="17">│            cnt            │
</div><div class="line" data-line="18">│             1             │
</div><div class="line" data-line="19">└─────────────┬─────────────┘
</div><div class="line" data-line="20">┌─────────────┴─────────────┐
</div><div class="line" data-line="21">│       HASH_GROUP_BY       │
</div><div class="line" data-line="22">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="23">│             #0            │
</div><div class="line" data-line="24">│        count_star()       │
</div><div class="line" data-line="25">└─────────────┬─────────────┘
</div><div class="line" data-line="26">┌─────────────┴─────────────┐
</div><div class="line" data-line="27">│         PROJECTION        │
</div><div class="line" data-line="28">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="29">│            url            │
</div><div class="line" data-line="30">└─────────────┬─────────────┘
</div><div class="line" data-line="31">┌─────────────┴─────────────┐
</div><div class="line" data-line="32">│           FILTER          │
</div><div class="line" data-line="33">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="34">│((cs-uri-stem ~~~ &#39;/posts/*│
</div><div class="line" data-line="35">│      &#39;) AND (COALESCE     │
</div><div class="line" data-line="36">│(array_extract(string_...  │
</div><div class="line" data-line="37">│-uri-stem, &#39;/&#39;), &#39;/&#39;),...  │
</div><div class="line" data-line="38">│         ) != &#39;&#39;))         │
</div><div class="line" data-line="39">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="40">│           EC: 0           │
</div><div class="line" data-line="41">└─────────────┬─────────────┘
</div><div class="line" data-line="42">┌─────────────┴─────────────┐
</div><div class="line" data-line="43">│        PARQUET_SCAN       │
</div><div class="line" data-line="44">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="45">│        cs-uri-stem        │
</div><div class="line" data-line="46">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="47">│           EC: 0           │
</div><div class="line" data-line="48">└───────────────────────────┘
</div></code></pre>
<p>DuckDB is pretty great at reading Parquet files but how about writing?</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">D COPY (SELECT * FROM read_parquet(&#39;*.gz.parq&#39;))
</div><div class="line" data-line="2">TO &#39;output.parquet&#39; (FORMAT PARQUET, CODEC &#39;ZSTD&#39;);
</div><div class="line" data-line="3">100% ▕████████████████████████████████████████████████████████████▏
</div></code></pre>
<p>It does not disappoint.</p>
<p>What about the performance? There is a built-in way to analyze it: EXPLAIN ANALYZE.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">EXPLAIN ANALYZE
</div><div class="line" data-line="2">COPY (SELECT * FROM read_parquet(&#39;*.gz.parq&#39;))
</div><div class="line" data-line="3">TO &#39;output.parquet&#39; (FORMAT PARQUET, CODEC &#39;ZSTD&#39;);
</div><div class="line" data-line="4">┌─────────────────────────────────────┐
</div><div class="line" data-line="5">│┌───────────────────────────────────┐│
</div><div class="line" data-line="6">││         Total Time: 4.91s         ││
</div><div class="line" data-line="7">│└───────────────────────────────────┘│
</div><div class="line" data-line="8">└─────────────────────────────────────┘
</div></code></pre>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">D EXPLAIN ANALYZE
</div><div class="line" data-line="2">SELECT COUNT(1) AS cnt,
</div><div class="line" data-line="3">split_part(rtrim(&quot;cs-uri-stem&quot;, &#39;/&#39;), &#39;/&#39;, 6) AS url
</div><div class="line" data-line="4">FROM read_parquet(&#39;*.gz.parq&#39;)
</div><div class="line" data-line="5">WHERE &quot;cs-uri-stem&quot; GLOB &#39;/posts/*&#39;
</div><div class="line" data-line="6">GROUP BY url
</div><div class="line" data-line="7">HAVING url &lt;&gt; &#39;&#39;
</div><div class="line" data-line="8">ORDER BY cnt
</div><div class="line" data-line="9">DESC LIMIT 15;
</div><div class="line" data-line="10">
</div><div class="line" data-line="11">┌─────────────────────────────┐
</div><div class="line" data-line="12">│┌───────────────────────────┐│
</div><div class="line" data-line="13">│└───────────────────────────┘│
</div><div class="line" data-line="14">└─────────────────────────────┘
</div><div class="line" data-line="15">┌─────────────────────────────────────┐
</div><div class="line" data-line="16">│┌───────────────────────────────────┐│
</div><div class="line" data-line="17">││    Query Profiling Information    ││
</div><div class="line" data-line="18">│└───────────────────────────────────┘│
</div><div class="line" data-line="19">└─────────────────────────────────────┘
</div><div class="line" data-line="20">EXPLAIN ANALYZE
</div><div class="line" data-line="21">SELECT
</div><div class="line" data-line="22">    COUNT(1) AS cnt,
</div><div class="line" data-line="23">    split_part(rtrim(&quot;cs-uri-stem&quot;, &#39;/&#39;), &#39;/&#39;, 6) AS url
</div><div class="line" data-line="24">    FROM read_parquet(&#39;*.gz.parq&#39;)
</div><div class="line" data-line="25">    WHERE &quot;cs-uri-stem&quot; GLOB &#39;/posts/*&#39;
</div><div class="line" data-line="26">    GROUP BY url
</div><div class="line" data-line="27">    HAVING url &lt;&gt; &#39;&#39;
</div><div class="line" data-line="28">    ORDER BY cnt DESC LIMIT 15;
</div><div class="line" data-line="29">┌─────────────────────────────────────┐
</div><div class="line" data-line="30">│┌───────────────────────────────────┐│
</div><div class="line" data-line="31">││        Total Time: 0.0727s        ││
</div><div class="line" data-line="32">│└───────────────────────────────────┘│
</div><div class="line" data-line="33">└─────────────────────────────────────┘
</div><div class="line" data-line="34">┌───────────────────────────┐
</div><div class="line" data-line="35">│      EXPLAIN_ANALYZE      │
</div><div class="line" data-line="36">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="37">│             0             │
</div><div class="line" data-line="38">│          (0.00s)          │
</div><div class="line" data-line="39">└─────────────┬─────────────┘
</div><div class="line" data-line="40">┌─────────────┴─────────────┐
</div><div class="line" data-line="41">│           TOP_N           │
</div><div class="line" data-line="42">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="43">│           Top 15          │
</div><div class="line" data-line="44">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="45">│       count(1) DESC       │
</div><div class="line" data-line="46">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="47">│             15            │
</div><div class="line" data-line="48">│          (0.00s)          │
</div><div class="line" data-line="49">└─────────────┬─────────────┘
</div><div class="line" data-line="50">┌─────────────┴─────────────┐
</div><div class="line" data-line="51">│         PROJECTION        │
</div><div class="line" data-line="52">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="53">│            cnt            │
</div><div class="line" data-line="54">│             1             │
</div><div class="line" data-line="55">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="56">│             46            │
</div><div class="line" data-line="57">│          (0.00s)          │
</div><div class="line" data-line="58">└─────────────┬─────────────┘
</div><div class="line" data-line="59">┌─────────────┴─────────────┐
</div><div class="line" data-line="60">│       HASH_GROUP_BY       │
</div><div class="line" data-line="61">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="62">│             #0            │
</div><div class="line" data-line="63">│        count_star()       │
</div><div class="line" data-line="64">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="65">│             46            │
</div><div class="line" data-line="66">│          (0.01s)          │
</div><div class="line" data-line="67">└─────────────┬─────────────┘
</div><div class="line" data-line="68">┌─────────────┴─────────────┐
</div><div class="line" data-line="69">│         PROJECTION        │
</div><div class="line" data-line="70">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="71">│            url            │
</div><div class="line" data-line="72">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="73">│           115262          │
</div><div class="line" data-line="74">│          (0.07s)          │
</div><div class="line" data-line="75">└─────────────┬─────────────┘
</div><div class="line" data-line="76">┌─────────────┴─────────────┐
</div><div class="line" data-line="77">│           FILTER          │
</div><div class="line" data-line="78">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="79">│((cs-uri-stem ~~~ &#39;/posts/*│
</div><div class="line" data-line="80">│      &#39;) AND (COALESCE     │
</div><div class="line" data-line="81">│(array_extract(string_...  │
</div><div class="line" data-line="82">│-uri-stem, &#39;/&#39;), &#39;/&#39;),...  │
</div><div class="line" data-line="83">│         ) != &#39;&#39;))         │
</div><div class="line" data-line="84">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="85">│           EC: 0           │
</div><div class="line" data-line="86">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="87">│           115262          │
</div><div class="line" data-line="88">│          (0.11s)          │
</div><div class="line" data-line="89">└─────────────┬─────────────┘
</div><div class="line" data-line="90">┌─────────────┴─────────────┐
</div><div class="line" data-line="91">│        PARQUET_SCAN       │
</div><div class="line" data-line="92">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="93">│        cs-uri-stem        │
</div><div class="line" data-line="94">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="95">│           EC: 0           │
</div><div class="line" data-line="96">│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
</div><div class="line" data-line="97">│           563607          │
</div><div class="line" data-line="98">│          (0.04s)          │
</div><div class="line" data-line="99">└───────────────────────────┘
</div></code></pre>
<p>This is insane performance for an M1 Mac on an average-sized dataset: 563,607 rows scanned and aggregated in about 73 ms.</p>
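<p>To put the profile into perspective, here is a back-of-the-envelope calculation using only the numbers reported in the EXPLAIN ANALYZE output above. Treat it as a rough estimate: the total time covers the whole query, not just the scan.</p>

```python
# Row counts and total time as reported in the EXPLAIN ANALYZE output above.
scanned_rows = 563_607   # rows produced by PARQUET_SCAN
filtered_rows = 115_262  # rows that passed the FILTER
total_seconds = 0.0727   # Total Time of the query

# Share of scanned rows that survive the filter.
selectivity = filtered_rows / scanned_rows

# Rough end-to-end throughput of the whole query.
rows_per_second = scanned_rows / total_seconds

print(f"filter keeps {selectivity:.1%} of the scanned rows")
print(f"about {rows_per_second / 1e6:.2f} million rows per second")
```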
<p>It is a really nice way of working with a local tool without too much hassle. The performance that you can get is amazing.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sun, 26 Mar 2023 15:36:01 +0100</pubDate>
    </item>
    <item>
      <title>Firecrackers</title>
      <link>https://dev.l1x.be/ai/firecrackers/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/firecrackers/</guid>
      <content:encoded><![CDATA[<p>Kids playing with firecrackers, Japanese, illustration, pencil drawing, 1800, realistic, 8k, greyscale --chaos 88 --ar 16:9 --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sun, 26 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Space duck</title>
      <link>https://dev.l1x.be/ai/space-duck/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/space-duck/</guid>
      <content:encoded><![CDATA[<p>Space duck illustration by Robert McCall and sergio toppi, grey and orange colours, top view, zoomed out --chaos 99 --ar 16:9 --quality 2 --v 5</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sun, 26 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Tech Drawing</title>
      <link>https://dev.l1x.be/ai/tech-drawing/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/tech-drawing/</guid>
      <content:encoded><![CDATA[<p>technical drawing, blueprint, japanese science-fiction advanced spaceship, bright, light, white and grey, industrial scifi, military design, greyscale --ar 16:9 --quality 2 --v 5</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Thu, 16 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Revue</title>
      <link>https://dev.l1x.be/ai/revue/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/revue/</guid>
      <content:encoded><![CDATA[<p>Folies Bergere La Revue de l’Amour, photorealistic, high resolution, 8k --ar 16:9 --quality 2 --v 5</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Thu, 16 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Using LLaMA with M1 Mac</title>
      <link>https://dev.l1x.be/posts/2023/03/12/using-llama-with-m1-mac/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2023/03/12/using-llama-with-m1-mac/</guid>
      <content:encoded><![CDATA[<p><img src="/static/img/og/llama.webp" alt="LLaMA" /></p>
<h2 id="the-large-language-models-wars"><a href="#the-large-language-models-wars">The large language models wars</a></h2>
<p>With the increasing interest in artificial intelligence and its use in everyday life, numerous exemplary models such as Meta's LLaMA, OpenAI's GPT-3, and Microsoft's Kosmos-1 are joining the group of large language models (LLMs). The only problem with such models is that you can't run them locally. Until now. Thanks to <a href="https://www.linkedin.com/in/georgi-gerganov-b230ab24">Georgi Gerganov</a> and his <a href="https://github.com/ggerganov/llama.cpp">llama.cpp</a> project, it is possible to run Meta's LLaMA on a single computer without a dedicated GPU.</p>
<h2 id="running-llama"><a href="#running-llama">Running LLaMA</a></h2>
<p>There are multiple steps involved in running LLaMA locally on an M1 Mac. I am not sure about other platforms or OSes, so this article focuses only on that combination.</p>
<h3 id="step-1-downloading-the-model"><a href="#step-1-downloading-the-model">Step 1: Downloading the model</a></h3>
<p>The official way is to request the model via <a href="https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform?usp=send_form">this</a> web form and download it afterward.</p>
<p>There is an open PR in the repository that describes an alternative way (which probably violates the terms of service).</p>
<p><a href="https://github.com/facebookresearch/llama/pull/73">https://github.com/facebookresearch/llama/pull/73</a></p>
<p>Anyway, after you have downloaded the model (or rather models, because the folder contains several sizes), you should have something like this:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">exa</span> <span style="color: #e6edf3;">--tree</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">.</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">├──</span> <span style="color: #e6edf3;">7B</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">checklist.chk</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.00.pth</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">└──</span> <span style="color: #e6edf3;">params.json</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">├──</span> <span style="color: #e6edf3;">13B</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">checklist.chk</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.00.pth</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.01.pth</span>
</div><div class="line" data-line="11"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">└──</span> <span style="color: #e6edf3;">params.json</span>
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">├──</span> <span style="color: #e6edf3;">30B</span>
</div><div class="line" data-line="13"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">checklist.chk</span>
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.00.pth</span>
</div><div class="line" data-line="15"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.01.pth</span>
</div><div class="line" data-line="16"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.02.pth</span>
</div><div class="line" data-line="17"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.03.pth</span>
</div><div class="line" data-line="18"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">└──</span> <span style="color: #e6edf3;">params.json</span>
</div><div class="line" data-line="19"><span style="color: #d2a8ff;">├──</span> <span style="color: #e6edf3;">65B</span>
</div><div class="line" data-line="20"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">checklist.chk</span>
</div><div class="line" data-line="21"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.00.pth</span>
</div><div class="line" data-line="22"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.01.pth</span>
</div><div class="line" data-line="23"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.02.pth</span>
</div><div class="line" data-line="24"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.03.pth</span>
</div><div class="line" data-line="25"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.04.pth</span>
</div><div class="line" data-line="26"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.05.pth</span>
</div><div class="line" data-line="27"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.06.pth</span>
</div><div class="line" data-line="28"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">├──</span> <span style="color: #e6edf3;">consolidated.07.pth</span>
</div><div class="line" data-line="29"><span style="color: #d2a8ff;">│</span>  <span style="color: #e6edf3;">└──</span> <span style="color: #e6edf3;">params.json</span>
</div><div class="line" data-line="30"><span style="color: #d2a8ff;">├──</span> <span style="color: #e6edf3;">tokenizer.model</span>
</div><div class="line" data-line="31"><span style="color: #d2a8ff;">└──</span> <span style="color: #e6edf3;">tokenizer_checklist.chk</span>
</div></code></pre>
<p>As you can see, the different models are in different folders. Each model has a params.json that contains details about the model.</p>
<p>For example:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-json" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="2">  <span style="color: #79c0ff;">&quot;dim&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">4096</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="3">  <span style="color: #79c0ff;">&quot;multiple_of&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">256</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="4">  <span style="color: #79c0ff;">&quot;n_heads&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">32</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="5">  <span style="color: #79c0ff;">&quot;n_layers&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">32</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="6">  <span style="color: #79c0ff;">&quot;norm_eps&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">1e-06</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="7">  <span style="color: #79c0ff;">&quot;vocab_size&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">-1</span>
</div><div class="line" data-line="8"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
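<p>As a quick sanity check, these values can be read with a few lines of Python. The sketch below inlines the JSON shown above so it runs on its own. The per-head size calculation assumes that dim is the embedding width split evenly across n_heads, which is my reading of the field names rather than something the file itself states.</p>

```python
import json

# The params.json content shown above, inlined so the sketch is self-contained.
params_text = """
{
  "dim": 4096,
  "multiple_of": 256,
  "n_heads": 32,
  "n_layers": 32,
  "norm_eps": 1e-06,
  "vocab_size": -1
}
"""

params = json.loads(params_text)

# Assumption: "dim" is the embedding width, split evenly across "n_heads".
head_dim = params["dim"] // params["n_heads"]

print(f"{params['n_layers']} layers, {params['n_heads']} heads, {head_dim} dims per head")
```

<p>To inspect a real file, replace the inlined string with the contents of the params.json in the model folder you are interested in.</p>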
<h3 id="step-2-installing-dependencies"><a href="#step-2-installing-dependencies">Step 2: Installing dependencies</a></h3>
<p>The Xcode command line tools must be installed to compile the C++ project. If you don't have them, run the following:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">xcode-select</span> <span style="color: #e6edf3;">--install</span>
</div></code></pre>
<p>These are the dependencies for building the C++ project (pkgconfig and cmake).</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">brew</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">pkgconfig</span> <span style="color: #e6edf3;">cmake</span>
</div></code></pre>
<p>Finally, we can install Torch.</p>
<p>I assume you have Python 3.11 installed so you can create a virtual env like this:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">/opt/homebrew/bin/python3.11</span> <span style="color: #e6edf3;">-m</span> <span style="color: #e6edf3;">venv</span> <span style="color: #e6edf3;">venv</span>
</div></code></pre>
<p>Activate the venv. I am using fish; for other shells, just drop the .fish suffix.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">.</span> <span style="color: #e6edf3;">venv/bin/activate.fish</span>
</div></code></pre>
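<p>If you are unsure whether the activation worked, a small check like the one below can help. It is a sketch that relies on the standard library only; the function name in_virtualenv is made up for this example.</p>

```python
import sys


def in_virtualenv():
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


print("venv active:", in_virtualenv())
print("interpreter:", sys.executable)
```

<p>When run with the venv's interpreter, the interpreter path should point inside the venv directory created in the previous step.</p>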
<p>After activating the venv, we can install PyTorch:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">pip3</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">--pre</span> <span style="color: #e6edf3;">torch</span> <span style="color: #e6edf3;">torchvision</span> <span style="color: #e6edf3;">--extra-index-url</span> <span style="color: #e6edf3;">https://download.pytorch.org/whl/nightly/cpu</span>
</div></code></pre>
<p>If you are interested in leveraging the new <a href="https://developer.apple.com/metal/pytorch/">Metal Performance Shaders (MPS) backend</a> for GPU training acceleration, you can verify that it is available by running the following. This is not required for running LLaMA on your M1, though:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">python</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">Python</span> 3.11.2 <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">main,</span> <span style="color: #e6edf3;">Feb</span> <span style="color: #79c0ff;">16</span> <span style="color: #e6edf3;">2023,</span> <span style="color: #e6edf3;">02:55:59</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">[</span><span style="color: #d2a8ff;">Clang</span> 14.0.0 <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">clang-1400.0.29.202</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">]</span> <span style="color: #d2a8ff;">on</span> <span style="color: #e6edf3;">darwin</span>
</div><div class="line" data-line="3"><span style="color: #e6edf3;">Type</span> <span style="color: #a5d6ff;">&quot;help&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;copyright&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;credits&quot;</span> <span style="color: #e6edf3;">or</span> <span style="color: #a5d6ff;">&quot;license&quot;</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">more</span> <span style="color: #e6edf3;">information.</span>
</div><div class="line" data-line="4"><span style="color: #79c0ff;">&gt;&gt;</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">import</span> <span style="color: #e6edf3;">torch</span><span style="color: #e6edf3;"></span><span style="color: #e6edf3;">;</span> <span style="color: #d2a8ff;">torch.backends.mps.is_available</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">True</span>
</div></code></pre>
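<p>The same check can be wrapped in a small script that falls back to the CPU when MPS is not usable. This is only a sketch: pick_device is a hypothetical helper, and it assumes a PyTorch build recent enough to expose torch.backends.mps (it returns "cpu" if PyTorch is missing altogether).</p>

```python
def pick_device() -> str:
    """Return "mps" when PyTorch reports a usable MPS backend, else "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "cpu"
    # Older PyTorch builds may not ship the mps module at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_built() and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("selected device:", pick_device())
```

<p>The returned string can be passed to calls such as torch.device(...) in your own training code.</p>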
<p>Now let's compile llama.cpp.</p>
<h3 id="step-3-compile-llama-cpp"><a href="#step-3-compile-llama-cpp">Step 3: Compile LLaMA CPP</a></h3>
<p>Cloning the repo:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">git</span> <span style="color: #e6edf3;">clone</span> <span style="color: #e6edf3;">git@github.com:ggerganov/llama.cpp.git</span>
</div></code></pre>
<p>After installing all the dependencies, you can run make:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">make</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">I</span> <span style="color: #e6edf3;">llama.cpp</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">info:</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">I</span> <span style="color: #e6edf3;">UNAME_S:</span>  <span style="color: #e6edf3;">Darwin</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">I</span> <span style="color: #e6edf3;">UNAME_P:</span>  <span style="color: #e6edf3;">arm</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">I</span> <span style="color: #e6edf3;">UNAME_M:</span>  <span style="color: #e6edf3;">arm64</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">I</span> <span style="color: #e6edf3;">CFLAGS:</span>   <span style="color: #e6edf3;">-I.</span>              <span style="color: #e6edf3;">-O3</span> <span style="color: #e6edf3;">-DNDEBUG</span> <span style="color: #e6edf3;">-std=c11</span>   <span style="color: #e6edf3;">-fPIC</span> <span style="color: #e6edf3;">-pthread</span> <span style="color: #e6edf3;">-DGGML_USE_ACCELERATE</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">I</span> <span style="color: #e6edf3;">CXXFLAGS:</span> <span style="color: #e6edf3;">-I.</span> <span style="color: #e6edf3;">-I./examples</span> <span style="color: #e6edf3;">-O3</span> <span style="color: #e6edf3;">-DNDEBUG</span> <span style="color: #e6edf3;">-std=c++11</span> <span style="color: #e6edf3;">-fPIC</span> <span style="color: #e6edf3;">-pthread</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">I</span> <span style="color: #e6edf3;">LDFLAGS:</span>   <span style="color: #e6edf3;">-framework</span> <span style="color: #e6edf3;">Accelerate</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">I</span> <span style="color: #e6edf3;">CC:</span>       <span style="color: #e6edf3;">Apple</span> <span style="color: #e6edf3;">clang</span> <span style="color: #e6edf3;">version</span> <span style="color: #e6edf3;">14.0.0</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">clang-1400.0.29.202</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">I</span> <span style="color: #e6edf3;">CXX:</span>      <span style="color: #e6edf3;">Apple</span> <span style="color: #e6edf3;">clang</span> <span style="color: #e6edf3;">version</span> <span style="color: #e6edf3;">14.0.0</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">clang-1400.0.29.202</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="11">
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">cc</span>  <span style="color: #e6edf3;">-I.</span>              <span style="color: #e6edf3;">-O3</span> <span style="color: #e6edf3;">-DNDEBUG</span> <span style="color: #e6edf3;">-std=c11</span>   <span style="color: #e6edf3;">-fPIC</span> <span style="color: #e6edf3;">-pthread</span> <span style="color: #e6edf3;">-DGGML_USE_ACCELERATE</span>   <span style="color: #e6edf3;">-c</span> <span style="color: #e6edf3;">ggml.c</span> <span style="color: #e6edf3;">-o</span> <span style="color: #e6edf3;">ggml.o</span>
</div><div class="line" data-line="13"><span style="color: #d2a8ff;">c++</span> <span style="color: #e6edf3;">-I.</span> <span style="color: #e6edf3;">-I./examples</span> <span style="color: #e6edf3;">-O3</span> <span style="color: #e6edf3;">-DNDEBUG</span> <span style="color: #e6edf3;">-std=c++11</span> <span style="color: #e6edf3;">-fPIC</span> <span style="color: #e6edf3;">-pthread</span> <span style="color: #e6edf3;">-c</span> <span style="color: #e6edf3;">utils.cpp</span> <span style="color: #e6edf3;">-o</span> <span style="color: #e6edf3;">utils.o</span>
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">c++</span> <span style="color: #e6edf3;">-I.</span> <span style="color: #e6edf3;">-I./examples</span> <span style="color: #e6edf3;">-O3</span> <span style="color: #e6edf3;">-DNDEBUG</span> <span style="color: #e6edf3;">-std=c++11</span> <span style="color: #e6edf3;">-fPIC</span> <span style="color: #e6edf3;">-pthread</span> <span style="color: #e6edf3;">main.cpp</span> <span style="color: #e6edf3;">ggml.o</span> <span style="color: #e6edf3;">utils.o</span> <span style="color: #e6edf3;">-o</span> <span style="color: #e6edf3;">main</span>  <span style="color: #e6edf3;">-framework</span> <span style="color: #e6edf3;">Accelerate</span>
</div><div class="line" data-line="15"><span style="color: #d2a8ff;">./main</span> <span style="color: #e6edf3;">-h</span>
</div><div class="line" data-line="16"><span style="color: #d2a8ff;">usage:</span> <span style="color: #e6edf3;">./main</span> <span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">options</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="17">
</div><div class="line" data-line="18"><span style="color: #d2a8ff;">options:</span>
</div><div class="line" data-line="19">  <span style="color: #d2a8ff;">-h,</span> <span style="color: #e6edf3;">--help</span>            <span style="color: #e6edf3;">show</span> <span style="color: #e6edf3;">this</span> <span style="color: #e6edf3;">help</span> <span style="color: #e6edf3;">message</span> <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">exit</span>
</div><div class="line" data-line="20">  <span style="color: #d2a8ff;">-s</span> <span style="color: #e6edf3;">SEED,</span> <span style="color: #e6edf3;">--seed</span> <span style="color: #e6edf3;">SEED</span>  <span style="color: #e6edf3;">RNG</span> <span style="color: #e6edf3;">seed</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">default:</span> <span style="color: #79c0ff;">-1</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="21">  <span style="color: #d2a8ff;">-t</span> <span style="color: #e6edf3;">N,</span> <span style="color: #e6edf3;">--threads</span> <span style="color: #e6edf3;">N</span>     <span style="color: #e6edf3;">number</span> <span style="color: #e6edf3;">of</span> <span style="color: #e6edf3;">threads</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">use</span> <span style="color: #e6edf3;">during</span> <span style="color: #e6edf3;">computation</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">default:</span> <span style="color: #79c0ff;">4</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="22">  <span style="color: #d2a8ff;">-p</span> <span style="color: #e6edf3;">PROMPT,</span> <span style="color: #e6edf3;">--prompt</span> <span style="color: #e6edf3;">PROMPT</span>
</div><div class="line" data-line="23">                        <span style="color: #d2a8ff;">prompt</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">start</span> <span style="color: #e6edf3;">generation</span> <span style="color: #e6edf3;">with</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">default:</span> <span style="color: #e6edf3;">random</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="24">  <span style="color: #d2a8ff;">-n</span> <span style="color: #e6edf3;">N,</span> <span style="color: #e6edf3;">--n_predict</span> <span style="color: #e6edf3;">N</span>   <span style="color: #e6edf3;">number</span> <span style="color: #e6edf3;">of</span> <span style="color: #e6edf3;">tokens</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">predict</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">default:</span> <span style="color: #79c0ff;">128</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="25">  <span style="color: #d2a8ff;">--top_k</span> <span style="color: #e6edf3;">N</span>             <span style="color: #e6edf3;">top-k</span> <span style="color: #e6edf3;">sampling</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">default:</span> <span style="color: #79c0ff;">40</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="26">  <span style="color: #d2a8ff;">--top_p</span> <span style="color: #e6edf3;">N</span>             <span style="color: #e6edf3;">top-p</span> <span style="color: #e6edf3;">sampling</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">default:</span> <span style="color: #e6edf3;">0.9</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="27">  <span style="color: #d2a8ff;">--temp</span> <span style="color: #e6edf3;">N</span>              <span style="color: #e6edf3;">temperature</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">default:</span> <span style="color: #e6edf3;">0.8</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="28">  <span style="color: #d2a8ff;">-b</span> <span style="color: #e6edf3;">N,</span> <span style="color: #e6edf3;">--batch_size</span> <span style="color: #e6edf3;">N</span>  <span style="color: #e6edf3;">batch</span> <span style="color: #e6edf3;">size</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">prompt</span> <span style="color: #e6edf3;">processing</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">default:</span> <span style="color: #79c0ff;">8</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="29">  <span style="color: #d2a8ff;">-m</span> <span style="color: #e6edf3;">FNAME,</span> <span style="color: #e6edf3;">--model</span> <span style="color: #e6edf3;">FNAME</span>
</div><div class="line" data-line="30">                        <span style="color: #d2a8ff;">model</span> path <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">default:</span> <span style="color: #e6edf3;">models/llama-7B/ggml-model.bin</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="31">
</div><div class="line" data-line="32"><span style="color: #d2a8ff;">c++</span> <span style="color: #e6edf3;">-I.</span> <span style="color: #e6edf3;">-I./examples</span> <span style="color: #e6edf3;">-O3</span> <span style="color: #e6edf3;">-DNDEBUG</span> <span style="color: #e6edf3;">-std=c++11</span> <span style="color: #e6edf3;">-fPIC</span> <span style="color: #e6edf3;">-pthread</span> <span style="color: #e6edf3;">quantize.cpp</span> <span style="color: #e6edf3;">ggml.o</span> <span style="color: #e6edf3;">utils.o</span> <span style="color: #e6edf3;">-o</span> <span style="color: #e6edf3;">quantize</span>  <span style="color: #e6edf3;">-framework</span> <span style="color: #e6edf3;">Accelerate</span>
</div></code></pre>
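<p>The options listed in the help output above can also be assembled from a script. The sketch below only builds the argument list; build_main_command is a hypothetical helper, not part of llama.cpp, and its defaults simply mirror the defaults printed by ./main -h.</p>

```python
import shlex


def build_main_command(model, prompt, threads=4, n_predict=128):
    """Build the argument list for ./main from the flags in its help text."""
    return [
        "./main",
        "-m", str(model),      # model path
        "-t", str(threads),    # number of threads
        "-n", str(n_predict),  # number of tokens to predict
        "-p", prompt,          # prompt to start generation with
    ]


if __name__ == "__main__":
    cmd = build_main_command("models/llama-7B/ggml-model.bin", "Hello")
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

<p>Passing the list to subprocess.run would start the binary, assuming it was built in the current directory and the model file exists.</p>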
<h3 id="step-4-converting-the-model"><a href="#step-4-converting-the-model">Step 4: Converting the model</a></h3>
<p>This assumes you placed the models under models/ in the llama.cpp repo.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">python</span> <span style="color: #e6edf3;">convert-pth-to-ggml.py</span> <span style="color: #e6edf3;">models/7B</span> <span style="color: #79c0ff;">1</span>
</div></code></pre>
<p>You should see output like this:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">&lbrace;</span><span style="color: #a5d6ff;">&#39;dim&#39;</span><span style="color: #a5d6ff;">:</span> <span style="color: #e6edf3;">4096,</span> <span style="color: #a5d6ff;">&#39;multiple_of&#39;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">256,</span> <span style="color: #a5d6ff;">&#39;n_heads&#39;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">32,</span> <span style="color: #a5d6ff;">&#39;n_layers&#39;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">32,</span> <span style="color: #a5d6ff;">&#39;norm_eps&#39;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">1e-06,</span> <span style="color: #a5d6ff;">&#39;vocab_size&#39;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">32000</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">n_parts</span> <span style="color: #e6edf3;">=</span>  <span style="color: #79c0ff;">1</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">Processing</span> <span style="color: #e6edf3;">part</span>  <span style="color: #79c0ff;">0</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">Processing</span> <span style="color: #e6edf3;">variable:</span> <span style="color: #e6edf3;">tok_embeddings.weight</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">shape:</span>  <span style="color: #e6edf3;">torch.Size</span><span style="color: #e6edf3;"></span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">[</span><span style="color: #d2a8ff;">32000,</span> <span style="color: #79c0ff;">4096</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span>  <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">type:</span>  <span style="color: #e6edf3;">torch.float16</span>
</div><div class="line" data-line="5"><span style="color: #e6edf3;">Processing</span> <span style="color: #e6edf3;">variable:</span> <span style="color: #e6edf3;">norm.weight</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">shape:</span>  <span style="color: #e6edf3;">torch.Size</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">4096</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span>  <span style="color: #d2a8ff;">and</span> <span style="color: #e6edf3;">type:</span>  <span style="color: #e6edf3;">torch.float16</span>
</div><div class="line" data-line="6">  <span style="color: #e6edf3;">Converting</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">float32</span>
</div><div class="line" data-line="7"><span style="color: #e6edf3;">Processing</span> <span style="color: #e6edf3;">variable:</span> <span style="color: #e6edf3;">output.weight</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">shape:</span>  <span style="color: #e6edf3;">torch.Size</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">[</span><span style="color: #d2a8ff;">32000,</span> <span style="color: #79c0ff;">4096</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span>  <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">type:</span>  <span style="color: #e6edf3;">torch.float16</span>
</div><div class="line" data-line="8"><span style="color: #e6edf3;">Processing</span> <span style="color: #e6edf3;">variable:</span> <span style="color: #e6edf3;">layers.0.attention.wq.weight</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">shape:</span>  <span style="color: #e6edf3;">torch.Size</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">4096,</span> <span style="color: #79c0ff;">4096</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span>  <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">type:</span>  <span style="color: #e6edf3;">torch.float16</span>
</div><div class="line" data-line="9"><span style="color: #e6edf3;">Processing</span> <span style="color: #e6edf3;">variable:</span> <span style="color: #e6edf3;">layers.0.attention.wk.weight</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">shape:</span>  <span style="color: #e6edf3;">torch.Size</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">4096,</span> <span style="color: #79c0ff;">4096</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span>  <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">type:</span>  <span style="color: #e6edf3;">torch.float16</span>
</div><div class="line" data-line="10"><span style="color: #e6edf3;">Processing</span> <span style="color: #e6edf3;">variable:</span> <span style="color: #e6edf3;">layers.0.attention.wv.weight</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">shape:</span>  <span style="color: #e6edf3;">torch.Size</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">4096,</span> <span style="color: #79c0ff;">4096</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span>  <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">type:</span>  <span style="color: #e6edf3;">torch.float16</span>
</div><div class="line" data-line="11"><span style="color: #e6edf3;">Processing</span> <span style="color: #e6edf3;">variable:</span> <span style="color: #e6edf3;">layers.0.attention.wo.weight</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">shape:</span>  <span style="color: #e6edf3;">torch.Size</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">4096,</span> <span style="color: #79c0ff;">4096</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span>  <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">type:</span>  <span style="color: #e6edf3;">torch.float16</span>
</div><div class="line" data-line="12"><span style="color: #e6edf3;">Processing</span> <span style="color: #e6edf3;">variable:</span> <span style="color: #e6edf3;">layers.0.feed_forward.w1.weight</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">shape:</span>  <span style="color: #e6edf3;">torch.Size</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">11008,</span> <span style="color: #79c0ff;">4096</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span>  <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">type:</span>  <span style="color: #e6edf3;">torch.float16</span>
</div><div class="line" data-line="13"><span style="color: #e6edf3;">Processing</span> <span style="color: #e6edf3;">variable:</span> <span style="color: #e6edf3;">layers.0.feed_forward.w2.weight</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">shape:</span>  <span style="color: #e6edf3;">torch.Size</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">4096,</span> <span style="color: #79c0ff;">11008</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span>  <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">type:</span>  <span style="color: #e6edf3;">torch.float16</span>
</div><div class="line" data-line="14"><span style="color: #e6edf3;">Processing</span> <span style="color: #e6edf3;">variable:</span> <span style="color: #e6edf3;">layers.0.feed_forward.w3.weight</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">shape:</span>  <span style="color: #e6edf3;">torch.Size</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">11008,</span> <span style="color: #79c0ff;">4096</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span>  <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">type:</span>  <span style="color: #e6edf3;">torch.float16</span>
</div><div class="line" data-line="15"><span style="color: #e6edf3;">Processing</span> <span style="color: #e6edf3;">variable:</span> <span style="color: #e6edf3;">layers.0.attention_norm.weight</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">shape:</span>  <span style="color: #e6edf3;">torch.Size</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">4096</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">)</span>  <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">type:</span>  <span style="color: #e6edf3;">torch.float16</span>
</div><div class="line" data-line="16"><span style="color: #e6edf3;">...</span>
</div><div class="line" data-line="17"><span style="color: #e6edf3;">Done.</span> <span style="color: #e6edf3;">Output</span> <span style="color: #e6edf3;">file:</span> <span style="color: #e6edf3;">models/7B/ggml-model-f16.bin,</span> <span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">part</span>  <span style="color: #79c0ff;">0</span> <span style="color: #e6edf3;">)</span>
</div></code></pre>
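<p>The shapes in this log are enough to estimate how many weights the 7B model has. The sketch below adds them up from the printed hyperparameters. It assumes that every layer has the same tensors as layer 0, plus one ffn_norm vector per layer that is hidden behind the "..." in the log; count_parameters is a name made up for this example.</p>

```python
# Hyperparameters printed by the conversion script.
DIM = 4096
N_LAYERS = 32
VOCAB_SIZE = 32000
# Feed-forward width, read off the w1/w3 shapes in the log ([11008, 4096]).
FFN_DIM = 11008


def count_parameters(dim, n_layers, vocab_size, ffn_dim):
    """Count weights from the tensor shapes listed in the conversion log."""
    attention = 4 * dim * dim          # wq, wk, wv, wo
    feed_forward = 3 * dim * ffn_dim   # w1, w2, w3
    norms = 2 * dim                    # attention_norm, ffn_norm (assumed)
    per_layer = attention + feed_forward + norms
    embeddings = 2 * vocab_size * dim  # tok_embeddings and output
    final_norm = dim                   # norm.weight
    return n_layers * per_layer + embeddings + final_norm


total = count_parameters(DIM, N_LAYERS, VOCAB_SIZE, FFN_DIM)
print(f"{total:,} parameters")  # 6,738,415,616 parameters
print(f"{total * 2 / 1024**3:.2f} GiB at 2 bytes per weight")
```

<p>The real f16 file will differ slightly from the 2-bytes-per-weight estimate, since it also stores a header and the vocabulary, and the log shows some tensors being converted to float32.</p>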
<p>The next step is to quantize the model:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">./quantize</span> <span style="color: #e6edf3;">./models/7B/ggml-model-f16.bin</span> <span style="color: #e6edf3;">./models/7B/ggml-model-q4_0.bin</span> <span style="color: #79c0ff;">2</span>
</div></code></pre>
<p>Output:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">llama_model_quantize:</span> <span style="color: #e6edf3;">loading</span> <span style="color: #e6edf3;">model</span> <span style="color: #e6edf3;">from</span> <span style="color: #a5d6ff;">&#39;./models/7B/ggml-model-f16.bin&#39;</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">llama_model_quantize:</span> <span style="color: #e6edf3;">n_vocab</span> <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">32000</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">llama_model_quantize:</span> <span style="color: #e6edf3;">n_ctx</span>   <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">512</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">llama_model_quantize:</span> <span style="color: #e6edf3;">n_embd</span>  <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">4096</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">llama_model_quantize:</span> <span style="color: #e6edf3;">n_mult</span>  <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">256</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">llama_model_quantize:</span> <span style="color: #e6edf3;">n_head</span>  <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">32</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">llama_model_quantize:</span> <span style="color: #e6edf3;">n_layer</span> <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">32</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">llama_model_quantize:</span> <span style="color: #e6edf3;">f16</span>     <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">1</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">...</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">layers.31.attention_norm.weight</span> <span style="color: #e6edf3;">-</span> <span style="color: #e6edf3;">[</span> <span style="color: #e6edf3;">4096,</span>     <span style="color: #79c0ff;">1</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">type</span> <span style="color: #e6edf3;">=</span>    <span style="color: #e6edf3;">f32</span> <span style="color: #e6edf3;">size</span> <span style="color: #e6edf3;">=</span>    <span style="color: #e6edf3;">0.016</span> <span style="color: #e6edf3;">MB</span>
</div><div class="line" data-line="11"><span style="color: #d2a8ff;">layers.31.ffn_norm.weight</span> <span style="color: #e6edf3;">-</span> <span style="color: #e6edf3;">[</span> <span style="color: #e6edf3;">4096,</span>     <span style="color: #79c0ff;">1</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">type</span> <span style="color: #e6edf3;">=</span>    <span style="color: #e6edf3;">f32</span> <span style="color: #e6edf3;">size</span> <span style="color: #e6edf3;">=</span>    <span style="color: #e6edf3;">0.016</span> <span style="color: #e6edf3;">MB</span>
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">llama_model_quantize:</span> <span style="color: #e6edf3;">model</span> <span style="color: #e6edf3;">size</span>  <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">25705.02</span> <span style="color: #e6edf3;">MB</span>
</div><div class="line" data-line="13"><span style="color: #d2a8ff;">llama_model_quantize:</span> <span style="color: #e6edf3;">quant</span> <span style="color: #e6edf3;">size</span>  <span style="color: #e6edf3;">=</span>  <span style="color: #e6edf3;">4017.27</span> <span style="color: #e6edf3;">MB</span>
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">llama_model_quantize:</span> <span style="color: #e6edf3;">hist:</span> <span style="color: #e6edf3;">0.000</span> <span style="color: #e6edf3;">0.022</span> <span style="color: #e6edf3;">0.019</span> <span style="color: #e6edf3;">0.033</span> <span style="color: #e6edf3;">0.053</span> <span style="color: #e6edf3;">0.078</span> <span style="color: #e6edf3;">0.104</span> <span style="color: #e6edf3;">0.125</span> <span style="color: #e6edf3;">0.134</span> <span style="color: #e6edf3;">0.125</span> <span style="color: #e6edf3;">0.104</span> <span style="color: #e6edf3;">0.078</span> <span style="color: #e6edf3;">0.053</span> <span style="color: #e6edf3;">0.033</span> <span style="color: #e6edf3;">0.019</span> <span style="color: #e6edf3;">0.022</span>
</div><div class="line" data-line="15">
</div><div class="line" data-line="16"><span style="color: #d2a8ff;">main:</span> <span style="color: #e6edf3;">quantize</span> <span style="color: #e6edf3;">time</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">29389.45</span> <span style="color: #e6edf3;">ms</span>
</div><div class="line" data-line="17"><span style="color: #d2a8ff;">main:</span>    <span style="color: #e6edf3;">total</span> <span style="color: #e6edf3;">time</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">29389.45</span> <span style="color: #e6edf3;">ms</span>
</div></code></pre>
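<p>The two sizes in this output give a rough idea of how much space each weight takes after quantization. The sketch below is back-of-the-envelope arithmetic under two assumptions that the output does not state: that "MB" means 1024 * 1024 bytes, and that "model size" counts every weight as a 4-byte float.</p>

```python
MIB = 1024 * 1024

# Sizes reported by ./quantize in the output above.
model_size_mb = 25705.02
quant_size_mb = 4017.27

# Assumption: "model size" counts every weight as a 4-byte float32.
approx_params = model_size_mb * MIB / 4
# Average storage per weight after quantization, in bits.
bits_per_weight = quant_size_mb * MIB * 8 / approx_params

print(f"approx. parameters: {approx_params / 1e9:.2f} billion")
print(f"bits per weight:    {bits_per_weight:.2f}")
print(f"size reduction:     {model_size_mb / quant_size_mb:.1f}x")
```

<p>Under these assumptions the result is roughly 5 bits per weight. That would be consistent with 4-bit values plus some per-block overhead, although the exact block layout is not visible in the output.</p>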
<h3 id="step5-running-the-model"><a href="#step5-running-the-model">Step 5: Running the model</a></h3>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">./main</span> <span style="color: #e6edf3;">-m</span> <span style="color: #e6edf3;">./models/7B/ggml-model-q4_0.bin</span> \
</div><div class="line" data-line="2">        <span style="color: #e6edf3;">-t</span> <span style="color: #79c0ff;">8</span> \
</div><div class="line" data-line="3">        <span style="color: #e6edf3;">-n</span> <span style="color: #79c0ff;">128</span> \
</div><div class="line" data-line="4">        <span style="color: #e6edf3;">-p</span> <span style="color: #a5d6ff;">&#39;The first president of the USA was &#39;</span>
</div></code></pre>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">main:</span> <span style="color: #e6edf3;">seed</span> <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">1678615879</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">loading</span> <span style="color: #e6edf3;">model</span> <span style="color: #e6edf3;">from</span> <span style="color: #a5d6ff;">&#39;./models/7B/ggml-model-q4_0.bin&#39;</span> <span style="color: #e6edf3;">-</span> <span style="color: #e6edf3;">please</span> <span style="color: #e6edf3;">wait</span> <span style="color: #e6edf3;">...</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">n_vocab</span> <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">32000</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">n_ctx</span>   <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">512</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">n_embd</span>  <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">4096</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">n_mult</span>  <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">256</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">n_head</span>  <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">32</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">n_layer</span> <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">32</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">n_rot</span>   <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">128</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">f16</span>     <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">2</span>
</div><div class="line" data-line="11"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">n_ff</span>    <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">11008</span>
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">n_parts</span> <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">1</span>
</div><div class="line" data-line="13"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">ggml</span> <span style="color: #e6edf3;">ctx</span> <span style="color: #e6edf3;">size</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">4529.34</span> <span style="color: #e6edf3;">MB</span>
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">memory_size</span> <span style="color: #e6edf3;">=</span>   <span style="color: #e6edf3;">512.00</span> <span style="color: #e6edf3;">MB,</span> <span style="color: #e6edf3;">n_mem</span> <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">16384</span>
</div><div class="line" data-line="15"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">loading</span> <span style="color: #e6edf3;">model</span> <span style="color: #e6edf3;">part</span> <span style="color: #e6edf3;">1/1</span> <span style="color: #e6edf3;">from</span> <span style="color: #a5d6ff;">&#39;./models/7B/ggml-model-q4_0.bin&#39;</span>
</div><div class="line" data-line="16"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">....................................</span> <span style="color: #e6edf3;">done</span>
</div><div class="line" data-line="17"><span style="color: #d2a8ff;">llama_model_load:</span> <span style="color: #e6edf3;">model</span> <span style="color: #e6edf3;">size</span> <span style="color: #e6edf3;">=</span>  <span style="color: #e6edf3;">4017.27</span> <span style="color: #e6edf3;">MB</span> <span style="color: #e6edf3;">/</span> <span style="color: #e6edf3;">num</span> <span style="color: #e6edf3;">tensors</span> <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">291</span>
</div><div class="line" data-line="18">
</div><div class="line" data-line="19"><span style="color: #d2a8ff;">main:</span> <span style="color: #e6edf3;">prompt:</span> <span style="color: #a5d6ff;">&#39;The first president of the USA was &#39;</span>
</div><div class="line" data-line="20"><span style="color: #d2a8ff;">main:</span> <span style="color: #e6edf3;">number</span> <span style="color: #e6edf3;">of</span> <span style="color: #e6edf3;">tokens</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">prompt</span> <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">9</span>
</div><div class="line" data-line="21">     <span style="color: #79c0ff;">1</span> <span style="color: #e6edf3;">-</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">&#39;&#39;</span>
</div><div class="line" data-line="22">  <span style="color: #79c0ff;">1576</span> <span style="color: #e6edf3;">-</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">&#39;The&#39;</span>
</div><div class="line" data-line="23">   <span style="color: #79c0ff;">937</span> <span style="color: #e6edf3;">-</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">&#39; first&#39;</span>
</div><div class="line" data-line="24">  <span style="color: #79c0ff;">6673</span> <span style="color: #e6edf3;">-</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">&#39; president&#39;</span>
</div><div class="line" data-line="25">   <span style="color: #79c0ff;">310</span> <span style="color: #e6edf3;">-</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">&#39; of&#39;</span>
</div><div class="line" data-line="26">   <span style="color: #79c0ff;">278</span> <span style="color: #e6edf3;">-</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">&#39; the&#39;</span>
</div><div class="line" data-line="27">  <span style="color: #79c0ff;">8278</span> <span style="color: #e6edf3;">-</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">&#39; USA&#39;</span>
</div><div class="line" data-line="28">   <span style="color: #79c0ff;">471</span> <span style="color: #e6edf3;">-</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">&#39; was&#39;</span>
</div><div class="line" data-line="29"> <span style="color: #79c0ff;">29871</span> <span style="color: #e6edf3;">-</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #a5d6ff;">&#39; &#39;</span>
</div><div class="line" data-line="30">
</div><div class="line" data-line="31"><span style="color: #d2a8ff;">sampling</span> <span style="color: #e6edf3;">parameters:</span> <span style="color: #e6edf3;">temp</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">0.800000,</span> <span style="color: #e6edf3;">top_k</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">40,</span> <span style="color: #e6edf3;">top_p</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">0.950000</span>
</div><div class="line" data-line="32">
</div><div class="line" data-line="33">
</div><div class="line" data-line="34"><span style="color: #d2a8ff;">The</span> <span style="color: #e6edf3;">first</span> <span style="color: #e6edf3;">president</span> <span style="color: #e6edf3;">of</span> <span style="color: #e6edf3;">the</span> <span style="color: #e6edf3;">USA</span> <span style="color: #e6edf3;">was</span> <span style="color: #79c0ff;">57</span> <span style="color: #e6edf3;">years</span> <span style="color: #e6edf3;">old</span> <span style="color: #e6edf3;">when</span> <span style="color: #e6edf3;">he</span> <span style="color: #e6edf3;">assumed</span> <span style="color: #e6edf3;">office</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">George</span> <span style="color: #e6edf3;">Washington</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;"></span><span style="color: #d2a8ff;">.</span> <span style="color: #e6edf3;">Nowadays,</span> <span style="color: #e6edf3;">the</span> <span style="color: #e6edf3;">US</span> <span style="color: #e6edf3;">electorate</span> <span style="color: #e6edf3;">expects</span> <span style="color: #e6edf3;">the</span> <span style="color: #e6edf3;">new</span> <span style="color: #e6edf3;">president</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">be</span> <span style="color: #e6edf3;">more</span> <span style="color: #e6edf3;">young</span> <span style="color: #e6edf3;">at</span> <span style="color: #e6edf3;">heart.</span> <span style="color: #e6edf3;">President</span> <span style="color: #e6edf3;">Donald</span> <span style="color: #e6edf3;">Trump</span> <span style="color: #e6edf3;">was</span> <span style="color: #79c0ff;">70</span> <span style="color: #e6edf3;">years</span> <span style="color: #e6edf3;">old</span> <span style="color: #e6edf3;">when</span> <span style="color: #e6edf3;">he</span> <span style="color: #e6edf3;">was</span> <span style="color: #e6edf3;">inaugurated.</span> <span style="color: 
#e6edf3;">In</span> <span style="color: #e6edf3;">contrast</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">his</span> <span style="color: #e6edf3;">predecessors,</span> <span style="color: #e6edf3;">he</span> <span style="color: #e6edf3;">is</span> <span style="color: #e6edf3;">physically</span> <span style="color: #e6edf3;">fit,</span> <span style="color: #e6edf3;">healthy</span> <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">active.</span> <span style="color: #e6edf3;">And</span> <span style="color: #e6edf3;">his</span> <span style="color: #e6edf3;">fitness</span> <span style="color: #e6edf3;">has</span> <span style="color: #e6edf3;">been</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">prominent</span> <span style="color: #e6edf3;">theme</span> <span style="color: #e6edf3;">of</span> <span style="color: #e6edf3;">his</span> <span style="color: #e6edf3;">presidency.</span> <span style="color: #e6edf3;">During</span> <span style="color: #e6edf3;">the</span> <span style="color: #e6edf3;">presidential</span> <span style="color: #e6edf3;">campaign,</span> <span style="color: #e6edf3;">he</span> <span style="color: #e6edf3;">famously</span> <span style="color: #e6edf3;">said</span> <span style="color: #e6edf3;">he</span>
</div><div class="line" data-line="35"> <span style="color: #d2a8ff;">would</span> <span style="color: #e6edf3;">be</span> <span style="color: #e6edf3;">the</span> <span style="color: #e6edf3;">“most</span> <span style="color: #e6edf3;">active</span> <span style="color: #e6edf3;">president</span> <span style="color: #e6edf3;">ever”</span> <span style="color: #e6edf3;">—</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">statement</span> <span style="color: #e6edf3;">Trump</span> <span style="color: #e6edf3;">has</span> <span style="color: #e6edf3;">not</span> <span style="color: #e6edf3;">yet</span> <span style="color: #e6edf3;">achieved,</span> <span style="color: #e6edf3;">but</span> <span style="color: #e6edf3;">one</span> <span style="color: #e6edf3;">that</span> <span style="color: #e6edf3;">fits</span> <span style="color: #e6edf3;">his</span> <span style="color: #e6edf3;">approach</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">the</span> <span style="color: #e6edf3;">office.</span> <span style="color: #e6edf3;">His</span> <span style="color: #e6edf3;">tweets</span> <span style="color: #e6edf3;">demonstrate</span> <span style="color: #e6edf3;">his</span> <span style="color: #e6edf3;">physical</span> <span style="color: #e6edf3;">activity.</span>
</div><div class="line" data-line="36">
</div><div class="line" data-line="37"><span style="color: #d2a8ff;">main:</span> <span style="color: #e6edf3;">mem</span> <span style="color: #e6edf3;">per</span> <span style="color: #e6edf3;">token</span> <span style="color: #e6edf3;">=</span> <span style="color: #79c0ff;">14434244</span> <span style="color: #e6edf3;">bytes</span>
</div><div class="line" data-line="38"><span style="color: #d2a8ff;">main:</span>     <span style="color: #e6edf3;">load</span> <span style="color: #e6edf3;">time</span> <span style="color: #e6edf3;">=</span>  <span style="color: #e6edf3;">1311.74</span> <span style="color: #e6edf3;">ms</span>
</div><div class="line" data-line="39"><span style="color: #d2a8ff;">main:</span>   <span style="color: #e6edf3;">sample</span> <span style="color: #e6edf3;">time</span> <span style="color: #e6edf3;">=</span>   <span style="color: #e6edf3;">278.96</span> <span style="color: #e6edf3;">ms</span>
</div><div class="line" data-line="40"><span style="color: #d2a8ff;">main:</span>  <span style="color: #e6edf3;">predict</span> <span style="color: #e6edf3;">time</span> <span style="color: #e6edf3;">=</span>  <span style="color: #e6edf3;">7375.89</span> <span style="color: #e6edf3;">ms</span> <span style="color: #e6edf3;">/</span> <span style="color: #e6edf3;">54.23</span> <span style="color: #e6edf3;">ms</span> <span style="color: #e6edf3;">per</span> <span style="color: #e6edf3;">token</span>
</div><div class="line" data-line="41"><span style="color: #d2a8ff;">main:</span>    <span style="color: #e6edf3;">total</span> <span style="color: #e6edf3;">time</span> <span style="color: #e6edf3;">=</span>  <span style="color: #e6edf3;">9216.61</span> <span style="color: #e6edf3;">ms</span>
</div></code></pre>
<p>Enjoy!</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sun, 12 Mar 2023 11:52:01 +0100</pubDate>
    </item>
    <item>
      <title>Martian Llama II</title>
      <link>https://dev.l1x.be/ai/martian-llama-ii/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/martian-llama-ii/</guid>
      <content:encoded><![CDATA[<p>Martian space llama by Robert McCall and sergio toppi, grey and orange colours, top view, zoomed out --chaos 88 --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sun, 12 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Martian Llama</title>
      <link>https://dev.l1x.be/ai/martian-llama/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/martian-llama/</guid>
      <content:encoded><![CDATA[<p>Martian space llama by Robert McCall and sergio toppi, grey and orange colours --chaos 88 --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sun, 12 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Cat Drawing</title>
      <link>https://dev.l1x.be/ai/cat-drawing/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/cat-drawing/</guid>
      <content:encoded><![CDATA[<p>cat, illustration, realistic face, japanese, pencil, technical drawing, drawing, 1800, greyscale --ar 16:9 --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Fri, 10 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Logo</title>
      <link>https://dev.l1x.be/ai/logo/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/logo/</guid>
      <content:encoded><![CDATA[<p>Logo without text, simple, vector, isometric, orange and grey, zen by Paul Rand --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Fri, 10 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Cyber Cat</title>
      <link>https://dev.l1x.be/ai/cyber-cat/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/cyber-cat/</guid>
      <content:encoded><![CDATA[<p>cyber cat --ar 16:9 --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Thu, 09 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Alien Firecracker</title>
      <link>https://dev.l1x.be/ai/alien-firecracker/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/alien-firecracker/</guid>
      <content:encoded><![CDATA[<p>firecracker science-fiction alien planet --ar 16:9 --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Thu, 09 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Hummingbird Schematics</title>
      <link>https://dev.l1x.be/ai/hummingbird-schematics/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/hummingbird-schematics/</guid>
      <content:encoded><![CDATA[<p>Engineering illustration, sketch, technical drawing of a hummingbird --ar 16:9 --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 08 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Parallel Universe</title>
      <link>https://dev.l1x.be/ai/parallel-universe/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/parallel-universe/</guid>
      <content:encoded><![CDATA[<p>Sketch drawing, geometry of parallel universe --ar 16:9 --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 08 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Hexagonal Logo</title>
      <link>https://dev.l1x.be/ai/hexagonal-logo/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/hexagonal-logo/</guid>
      <content:encoded><![CDATA[<p>very simple abstract grey triangle with a hexagon inside, logo, grey and orange, --chaos 99 --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 08 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Parallel Universe II</title>
      <link>https://dev.l1x.be/ai/parallel-universe-ii/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/parallel-universe-ii/</guid>
      <content:encoded><![CDATA[<p>Sketch drawing, geometry of parallel universe, mild orange, science fiction equipment --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 08 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Astronaut</title>
      <link>https://dev.l1x.be/ai/astronaut/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/astronaut/</guid>
      <content:encoded><![CDATA[<p>Female Astronaut --ar 16:9 --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 08 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Parallel Universe IV</title>
      <link>https://dev.l1x.be/ai/parallel-universe-iv/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/parallel-universe-iv/</guid>
      <content:encoded><![CDATA[<p>Sketch drawing, geometry of parallel universe --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Tue, 07 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Parallel Universe III</title>
      <link>https://dev.l1x.be/ai/parallel-universe-iii/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/parallel-universe-iii/</guid>
      <content:encoded><![CDATA[<p>Sketch drawing, geometry of parallel universe, mild orange --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Tue, 07 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Meteora</title>
      <link>https://dev.l1x.be/ai/meteora/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/meteora/</guid>
      <content:encoded><![CDATA[<p>I got this prompt from somebody on Midjourney:</p>
<p>Bird view of a beautiful night at Meteora, misty and fog, finely detailed, extremely ornate, octane render, Unreal Engine, Cinematic, Color Grading, Photoshoot, Gamma, White Balance, Neon, Light, Dark, Light Mode, Dark Mode, High Contrast, 5D, Multiverse, 32k, Super - Resolution, Megapixel, ProPhoto RGB, VR, Lonely, Good, Massive, Big, Spotlight, Frontlight, Halfrear Lighting, Backlight, Rim Lights, Rim Lighting, Artificial Lighting, Natural Lighting, Incandescent, Moody Lighting, Cinematic Lighting, Studio Lighting, Soft Lighting, Hard Lighting, volumetric Light, Volumetric Lighting, Volumetric, Contre - Jour, Lighting, Split Lighting, Beautiful Lighting, Global Illumination, Lumen Global Illumination, Screen Space Global Illumination, Ray Tracing Global Illumination, Materiality, Ambient Occlusion, Scattering, Glowing, Shadows, Rough, Ray Tracing Reflections, Lumen Reflections, Screen Space Reflections, Diffraction Grading, Chromatic Aberration, Scan Lines, Ray Traced, Ambient Occlusion, Anti - Aliasing, FXAA, TXAA, RTX, SSAO, Shaders, OpenGL - Shaders, GLSL - Shaders, Post Processing, Post - Production, Cel Shading, Tone Mapping, CGI, VFX, SFX, insanely detailed and intricate, hypermaximalist, elegant, hyper realistic, super detailed, photography, highly detailed and realistic, photoreal, Ultrarealistic, cinematography, hyper detailed, absolute realism, cinematic scene cinematic lighting, hyper resolution, Perfectionism, angelical, elegant, concept art, ultra - detailed, hypermaximalist, Epic, Photorealism, Very high detail, shot on professional camera, cinematic lighting, soft lighting, intricate details, depth of field, photography, cinematic shot, incredibly detailed, 4K. 8K, 16K --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Tue, 07 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Tech Drawing II</title>
      <link>https://dev.l1x.be/ai/tech-drawing-ii/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/tech-drawing-ii/</guid>
      <content:encoded><![CDATA[<p>Star wars vehicles technical drawing, bright, light, white and grey, industrial scifi, greyscale --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Tue, 07 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Portrait III</title>
      <link>https://dev.l1x.be/ai/portrait-iii/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/portrait-iii/</guid>
      <content:encoded><![CDATA[<p>Folies Bergere La Revue de l’Amour, photorealistic, Klimt, high resolution, 8k --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Alien Planet VII</title>
      <link>https://dev.l1x.be/ai/alien-planet-vii/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/alien-planet-vii/</guid>
      <content:encoded><![CDATA[<p>Space ninja exiting a stargate on an alien planet --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Alien Planet III</title>
      <link>https://dev.l1x.be/ai/alien-planet-iii/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/alien-planet-iii/</guid>
      <content:encoded><![CDATA[<p>a city for near contemporary terrestrial civilization with an energy capability equivalent to the solar insolation on Earth --ar 16:9 --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Space Station</title>
      <link>https://dev.l1x.be/ai/space-station/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/space-station/</guid>
      <content:encoded><![CDATA[<p>space station with nuclear power plant, many smaller spaceships docking, high-tech, galaxies in distance, scenery, hi-resolution, 8k, photography, detailed, grey and orange colours --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Alien Planet II</title>
      <link>https://dev.l1x.be/ai/alien-planet-ii/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/alien-planet-ii/</guid>
      <content:encoded><![CDATA[<p>an aquatic alien planet, futuristic underwater city, under a lake, humanoid guppy explorers swim towards the city, sunset, by Syd Mead and James Jean and Jeremy Mann, Superimposed with a stained glass landscape by Tiffany and Erin Hanson and Joseph Zbukvic --ar 1:2 --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Alien Planet</title>
      <link>https://dev.l1x.be/ai/alien-planet/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/alien-planet/</guid>
      <content:encoded><![CDATA[<p>an aquatic alien planet, futuristic underwater city, under a lake, humanoid guppy explorers swim towards the city, sunset, by Syd Mead and James Jean and Jeremy Mann, Superimposed with a stained glass landscape by Tiffany and Erin Hanson and Joseph Zbukvic --ar 16:9 --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Alien Planet IV</title>
      <link>https://dev.l1x.be/ai/alien-planet-iv/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/alien-planet-iv/</guid>
      <content:encoded><![CDATA[<p>discover an Advanced Civilization, Kardashev Type 3 Civilization, super-sophisticated buildings that blend with the surrounding forest, you can see the people in the park, spaceships passing by, you can also see the sky ::1 planets, stars, to the outer space post --ar 16:9 --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Portrait</title>
      <link>https://dev.l1x.be/ai/portrait/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/portrait/</guid>
      <content:encoded><![CDATA[<p>woman with a dragon tattoo on her chest, photorealistic, high resolution, 35mm, f1.8, 8k, Helmut Newton --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Alien Planet V</title>
      <link>https://dev.l1x.be/ai/alien-planet-v/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/alien-planet-v/</guid>
      <content:encoded><![CDATA[<p>discover an Advanced Civilization, Kardashev Type 3 Civilization, super-sophisticated buildings that blend with the surrounding forest, you can see the people in the park, spaceships passing by, you can also see the sky ::1 planets, stars, to the outer space post --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Portrait II</title>
      <link>https://dev.l1x.be/ai/portrait-ii/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/portrait-ii/</guid>
      <content:encoded><![CDATA[<p>woman with a dragon tattoo on her chest, photorealistic, high resolution, 35mm, f1.8, 8k, Helmut Newton --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Alien Planet VI</title>
      <link>https://dev.l1x.be/ai/alien-planet-vi/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/alien-planet-vi/</guid>
      <content:encoded><![CDATA[<p>discover an Advanced Civilization, Kardashev Type 3 Civilization, super-sophisticated buildings that blend with the surrounding forest, you can see the people in the park, spaceships passing by, you can also see the sky ::1 planets, stars, to the outer space post --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Fantasy Woman</title>
      <link>https://dev.l1x.be/ai/fantasy-woman/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/fantasy-woman/</guid>
      <content:encoded><![CDATA[<p>fantasy, front side view ,torso shot, dreamcatcher, moon ,glowing stardust, hot shapely young cute sandgirl yo 20, beautiful long hair,random dynamic Pose,wearing white lace cropped tanktop and skirt ornaments, powerful, strong,league of legends,gorgeous face, photorealistic, hyperrealistic, dramatic, extreme composition, realistic, cinematic lighting, photography lighting, lightroom gallery, behance photography, professional photo ::5 oil painting ::0 ,5 nebular ::3 mystic ::1 2 anime ::0 ,5 --ar 16:9 --quality 2 --uplight --stylize 250 --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Alien Woman</title>
      <link>https://dev.l1x.be/ai/alien-woman/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/alien-woman/</guid>
      <content:encoded><![CDATA[<p>Hypnotic visual aberration, a sci-fi epic full body portrait of a future alien woman sitting in a yoga pose with supple curves and a soft glow, sitting on a bench, city in the background --ar 16:9 --quality 2 --uplight --v 4</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Space Station II</title>
      <link>https://dev.l1x.be/ai/space-station-ii/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/space-station-ii/</guid>
      <content:encoded><![CDATA[<p>space station with nuclear power plant, many smaller space ships docking, high-tech, planet in distance, scenery, hi-resolution, 8k, photography, detailed, grey and orange colours --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Alien Planet VIII</title>
      <link>https://dev.l1x.be/ai/alien-planet-viii/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/alien-planet-viii/</guid>
      <content:encoded><![CDATA[<p>Alien desert planet scenery with a city in the background, robotic orange space serpent, as high-tech alien equipment on the ground, wide, science fiction, futuristic space suit with technical details, photography, high resolution, high-fidelity, detailed, light background, zen, hexagon, stargate, flames --chaos 55 --ar 16:9 --seed 42 --stylize 44 --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 06 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Space Station IV</title>
      <link>https://dev.l1x.be/ai/space-station-iv/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/space-station-iv/</guid>
      <content:encoded><![CDATA[<p>military ships approaching docking bay of a space station --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sat, 04 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Space Station III</title>
      <link>https://dev.l1x.be/ai/space-station-iii/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/space-station-iii/</guid>
      <content:encoded><![CDATA[<p>wide angle. futuristic high-tech pyramid space station at large distance shot ::4 floating in space ::2 , smaller high-tech alien warships ::1 are approaching for docking, space, orange and grey, science-fiction, photography, detailed, high-resolution --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sat, 04 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Space Station V</title>
      <link>https://dev.l1x.be/ai/space-station-v/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/space-station-v/</guid>
      <content:encoded><![CDATA[<p>space station with nuclear power plant, many smaller space ships docking, high-tech, planets in distance, scenery, hi-resolution, 8k, photography, detailed, grey and orange colours --ar 16:9 --quality 2 --uplight --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sat, 04 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Mushroom Forest</title>
      <link>https://dev.l1x.be/ai/mushroom-forest/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/mushroom-forest/</guid>
      <content:encoded><![CDATA[<p>Matsuri festival in a fantasy mushroom forest with smiling felines, field of view, realistic, high resolution, photorealistic --ar 16:9 --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 01 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Riding Hood</title>
      <link>https://dev.l1x.be/ai/riding-hood/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/riding-hood/</guid>
      <content:encoded><![CDATA[<p>Smiling Little Red Riding Hood in a mushroom forest, bright colors, day light, point of view, tropical, realistic , 8k, fantasy --ar 16:9 --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 01 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Android</title>
      <link>https://dev.l1x.be/ai/android/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/android/</guid>
      <content:encoded><![CDATA[<p>moderne retro futuristic female android model, with abstract molten orange wig, marionette lines, robot face panels, stunning beautiful white plastic skin, robotic lense eyes, beautiful holographic led mouth, retro vivid background of circles and retro patterns --ar 16:9 --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 01 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Zelda</title>
      <link>https://dev.l1x.be/ai/zelda/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/zelda/</guid>
      <content:encoded><![CDATA[<p>Father and son walking on a flowery field in Zelda, photorealistic, 8k, field of view, high resolution --ar 16:9 --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 01 Mar 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Using Python 3.11 with AWS Lambda</title>
      <link>https://dev.l1x.be/posts/2023/02/28/using-python-3.11-with-aws-lambda/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2023/02/28/using-python-3.11-with-aws-lambda/</guid>
      <content:encoded><![CDATA[<p><img src="/static/img/og/python_lambda.webp" alt="Python Lambda" /></p>
<h2 id="a-python-problem"><a href="#a-python-problem">A Python Problem</a></h2>
<p>At the time of writing, AWS Lambda's Python offering is limited to version 3.9. Using that version is challenging for many reasons, but primarily because package maintainers do not necessarily cater to users on older versions. And there is more. When you work with Python, you must realize that you usually have C, C++, Rust, or libc dependencies, not to mention different CPU architectures. The most useful Python libraries are written in C, C++, Fortran, or Rust. So when you try to deploy your code to a less common platform, it can burst into flames in the worst kind of ways.</p>
<p>I have spent more hours trying to get some Python library to work on a platform than learning Rust.</p>
<p>The last nail in the coffin of trying to use Python 3.9 on AWS was a library with a Rust component that was compiled against a newer version of libc than the one AWS provides. That triggered the creation of a new solution for deploying Python to AWS Lambda so that we do not need to care about these issues anymore.</p>
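<p>One way to spot this kind of mismatch early is to ask the interpreter which C library it is linked against, both on the build machine and inside the target runtime. The sketch below uses only the standard library. The helper name <code>describe_runtime</code> is made up for illustration, and the values it reports depend entirely on the machine it runs on.</p>

```python
import platform
import sys


def describe_runtime() -> dict:
    """Collect the details that decide whether a binary wheel can load."""
    # libc_ver() returns a (name, version) pair such as ("glibc", "2.31");
    # both strings are empty when the C library cannot be determined.
    libc_name, libc_version = platform.libc_ver()
    return {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "machine": platform.machine(),  # for example "x86_64" or "aarch64"
        "libc": libc_name,
        "libc_version": libc_version,
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

<p>Running the same script locally and in the deployment environment, then comparing the two outputs, shows whether a compiled dependency was built for a newer libc or a different CPU architecture than the one it will run on.</p>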
<h2 id="docker-for-the-rescue"><a href="#docker-for-the-rescue">Docker to the Rescue</a></h2>
<p>By using a Docker image as the source for your Lambda, you can simplify your deployment. Docker allows you to package your application and its dependencies into a single container. In addition, you can deploy your entire application stack with a single command, making the deployment process simpler.</p>
<p>Docker provides a consistent environment for your application to run in, regardless of the underlying infrastructure. With a Docker image as the source for your AWS Lambda function, the whole application stack, including the operating system, libraries, and dependencies, is identical across all environments. This consistency reduces the risk of errors or unexpected behavior when the application moves from one environment to another.</p>
<h2 id="how-to-do-it"><a href="#how-to-do-it">How to Do It</a></h2>
<ol>
<li>Create, build, and upload a base image with the required Python version</li>
<li>Create, build, and upload an application-specific image</li>
<li>Set up and deploy the Lambda function with Terraform</li>
</ol>
<h3 id="base-dockerfile"><a href="#base-dockerfile">Base Dockerfile</a></h3>
<p>This Dockerfile defines the base image from which we build our application-specific image. It provides a stable environment for the app. To make the final image smaller, we use a multi-stage build. We specify the Python version we require for our runtime in the FROM lines. Then, we install the AWS Lambda Runtime Interface Client (awslambdaric), which handles communication between the Lambda environment and our code. In the second stage, we only need to copy the virtual environment that contains the runtime interface client into the final image.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">ARG FUNCTION_DIR=&quot;/var/task&quot;
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">FROM python:3.11.2-slim-bullseye as build-image
</div><div class="line" data-line="4">ARG FUNCTION_DIR
</div><div class="line" data-line="5">
</div><div class="line" data-line="6">RUN mkdir -p $&lbrace;FUNCTION_DIR&rbrace; &amp;&amp; mkdir -p /venv
</div><div class="line" data-line="7">RUN useradd -m -u 5000 lambda || :
</div><div class="line" data-line="8">RUN chown lambda $&lbrace;FUNCTION_DIR&rbrace; &amp;&amp; chown lambda /venv
</div><div class="line" data-line="9">RUN apt-get update &amp;&amp; \
</div><div class="line" data-line="10">  apt-get install -y --no-install-recommends \
</div><div class="line" data-line="11">  g++ \
</div><div class="line" data-line="12">  make \
</div><div class="line" data-line="13">  cmake \
</div><div class="line" data-line="14">  unzip \
</div><div class="line" data-line="15">  libcurl4-openssl-dev &amp;&amp; \
</div><div class="line" data-line="16">  apt-get clean &amp;&amp; \
</div><div class="line" data-line="17">  rm -rf /var/lib/apt/lists/*
</div><div class="line" data-line="18">USER lambda
</div><div class="line" data-line="19">RUN python -m venv /venv
</div><div class="line" data-line="20">ENV PATH=&quot;/venv/bin:$PATH&quot;
</div><div class="line" data-line="21">RUN which pip
</div><div class="line" data-line="22">RUN pip install pip --upgrade
</div><div class="line" data-line="23">RUN pip install awslambdaric
</div><div class="line" data-line="24">
</div><div class="line" data-line="25">FROM python:3.11.2-slim-bullseye
</div><div class="line" data-line="26">RUN mkdir -p /venv
</div><div class="line" data-line="27">RUN useradd -m -u 5000 lambda || :
</div><div class="line" data-line="28">RUN chown lambda /venv
</div><div class="line" data-line="29">USER lambda
</div><div class="line" data-line="30">COPY --from=build-image /venv /venv
</div><div class="line" data-line="31">ENV PATH=&quot;/venv/bin:$PATH&quot;
</div><div class="line" data-line="32">RUN pip list
</div><div class="line" data-line="33">ENTRYPOINT [ &quot;python&quot;, &quot;-m&quot;, &quot;awslambdaric&quot; ]
</div></code></pre>
<p>As you can see, we run our app as a dedicated user instead of root. Running your application as a non-root user is recommended. In the context of AWS Lambda, security is probably not that big of a deal, but following the principle of least privilege (PoLP) is a good idea.</p>
<p>We use Ninja for building and uploading images. You can find more about this here: <a href="https://dev.l1x.be/posts/2023/02/21/misusing-ninja/">Misusing Ninja</a>. After being built, the base image is uploaded to AWS ECR, so it can be fetched for the application-specific image. This is our build.ninja file:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">aws_account_id</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">xxxxxxxxxx</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">python_version</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">3.11.2</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">repo</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">my-repo</span>
</div><div class="line" data-line="4">
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">rule</span> <span style="color: #e6edf3;">login-to-ecr</span>
</div><div class="line" data-line="6">  <span style="color: #d2a8ff;">command</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">aws</span> <span style="color: #e6edf3;">ecr</span> <span style="color: #e6edf3;">get-login-password</span> <span style="color: #e6edf3;">--region</span> <span style="color: #e6edf3;">eu-west-1</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">docker</span> <span style="color: #e6edf3;">login</span> <span style="color: #e6edf3;">--username</span> <span style="color: #e6edf3;">AWS</span> <span style="color: #e6edf3;">--password-stdin</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">aws_account_id</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">.dkr.ecr.eu-west-1.amazonaws.com</span>
</div><div class="line" data-line="7">  <span style="color: #d2a8ff;">description</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">Logging</span> <span style="color: #e6edf3;">into</span> <span style="color: #e6edf3;">ECR</span>
</div><div class="line" data-line="8">
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">rule</span> <span style="color: #e6edf3;">build-aws-lambda-base</span>
</div><div class="line" data-line="10">  <span style="color: #d2a8ff;">command</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">.</span> <span style="color: #e6edf3;">-t</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:lambda-py-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">python_version</span><span style="color: #e6edf3;">&rbrace;</span> <span style="color: #e6edf3;">--file</span> <span style="color: #e6edf3;">Dockerfile.aws_lambda_python_</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">python_version</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="11">  <span style="color: #d2a8ff;">description</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">Building</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:lambda-py-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">python_version</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="12">
</div><div class="line" data-line="13"><span style="color: #d2a8ff;">rule</span> <span style="color: #e6edf3;">tag-aws-lamda-base</span>
</div><div class="line" data-line="14">  <span style="color: #d2a8ff;">command</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">tag</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:lambda-py-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">python_version</span><span style="color: #e6edf3;">&rbrace;</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">aws_account_id</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">.dkr.ecr.eu-west-1.amazonaws.com/</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:lambda-py-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">python_version</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="15">  <span style="color: #d2a8ff;">description</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">Tagging</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:lambda-py-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">python_version</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="16">
</div><div class="line" data-line="17"><span style="color: #d2a8ff;">rule</span> <span style="color: #e6edf3;">push-aws-lamda-base</span>
</div><div class="line" data-line="18">  <span style="color: #d2a8ff;">command</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">push</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">aws_account_id</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">.dkr.ecr.eu-west-1.amazonaws.com/</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:lambda-py-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">python_version</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="19">  <span style="color: #d2a8ff;">description</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">Push</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:lambda-py-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">python_version</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="20">
</div><div class="line" data-line="21"><span style="color: #d2a8ff;">build</span> <span style="color: #e6edf3;">login-to-ecr:</span> <span style="color: #e6edf3;">login-to-ecr</span>
</div><div class="line" data-line="22"><span style="color: #d2a8ff;">build</span> <span style="color: #e6edf3;">build-aws-lambda-base:</span> <span style="color: #e6edf3;">build-aws-lambda-base</span> <span style="color: #79c0ff;">||</span> <span style="color: #d2a8ff;">login-to-ecr</span>
</div><div class="line" data-line="23"><span style="color: #d2a8ff;">build</span> <span style="color: #e6edf3;">tag-aws-lamda-base:</span> <span style="color: #e6edf3;">tag-aws-lamda-base</span> <span style="color: #79c0ff;">||</span> <span style="color: #d2a8ff;">build-aws-lambda-base</span>
</div><div class="line" data-line="24"><span style="color: #d2a8ff;">build</span> <span style="color: #e6edf3;">push-aws-lamda-base:</span> <span style="color: #e6edf3;">push-aws-lamda-base</span> <span style="color: #79c0ff;">||</span> <span style="color: #d2a8ff;">tag-aws-lamda-base</span>
</div><div class="line" data-line="25">
</div><div class="line" data-line="26"><span style="color: #d2a8ff;">default</span> <span style="color: #e6edf3;">login-to-ecr</span> <span style="color: #e6edf3;">build-aws-lambda-base</span> <span style="color: #e6edf3;">tag-aws-lamda-base</span> <span style="color: #e6edf3;">push-aws-lamda-base</span>
</div></code></pre>
<p>Let's run it:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">ninja</span> <span style="color: #e6edf3;">-f</span> <span style="color: #e6edf3;">build.ninja</span>
</div></code></pre>
<p>We now have the base image in ECR. This gives us control over what goes into the application image later. For immutable infrastructure, it is paramount to use a base image that is not a moving target, meaning you do not run apt-get update or a similar command every time you build an application-specific container.</p>
<h3 id="application-specific-dockerfile"><a href="#application-specific-dockerfile">Application-Specific Dockerfile</a></h3>
<p>This is the Dockerfile for the application that runs inside the Lambda function as a container. The base image that we just created is referenced on the first line.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">FROM &lt;aws_account_id&gt;.dkr.ecr.eu-west-1.amazonaws.com/&lt;repo&gt;:lambda-py-3.11.2
</div><div class="line" data-line="2">ARG FUNCTION_DIR=&quot;/var/task&quot;
</div><div class="line" data-line="3">ENV PATH=&quot;/venv/bin:$PATH&quot;
</div><div class="line" data-line="4">USER lambda
</div><div class="line" data-line="5">WORKDIR $&lbrace;FUNCTION_DIR&rbrace;
</div><div class="line" data-line="6">ADD app .
</div><div class="line" data-line="7">COPY pyproject.toml .
</div><div class="line" data-line="8">COPY README.md .
</div><div class="line" data-line="9">RUN pip install .
</div><div class="line" data-line="10">RUN pip list
</div><div class="line" data-line="11">CMD [&quot;app.handler&quot;]
</div></code></pre>
<p>When we install the Python packages in this file, we usually lock each one to a specific version. Again, we would like to have immutable infrastructure: running the build on different computers should result in the same image.</p>
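<p>A small check can keep unpinned dependencies from slipping in. The sketch below assumes the dependencies are available as a list of requirement strings, as they would appear in the <code>dependencies</code> list of pyproject.toml. The function name <code>unpinned</code> and the package names are made up for illustration, and the pattern is a simplification rather than a full requirement parser.</p>

```python
import re

# Accepts "name==1.2.3" and "name[extra]==1.2.3", optionally followed by an
# environment marker after ";". Wildcards such as "1.2.*" are not exact pins.
PINNED = re.compile(
    r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9,._-]+\])?\s*==\s*[^\s*;,]+(\s*;.*)?$"
)


def unpinned(dependencies):
    """Return the requirement strings that do not pin an exact version."""
    return [dep for dep in dependencies if not PINNED.match(dep.strip())]


if __name__ == "__main__":
    # Hypothetical requirement strings, not the real project's dependencies.
    deps = ["examplelib==1.2.3", "otherlib>=2.0", "thirdlib"]
    for dep in unpinned(deps):
        print(f"not pinned: {dep}")
```

<p>Failing the build when this list is not empty is one way to make sure that two builds of the same commit install the same package versions.</p>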
<p>Our directory looks like the following:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">|__app/
</div><div class="line" data-line="2">| |__app.py
</div><div class="line" data-line="3">| |__pyproject.toml
</div><div class="line" data-line="4">| |__README.MD
</div><div class="line" data-line="5">|
</div><div class="line" data-line="6">|__Dockerfile
</div><div class="line" data-line="7">|__build.ninja
</div></code></pre>
<p>If you have a more complicated application, there will be many more files in the app folder.</p>
<p>Our entry point is the function called handler inside app.py. CMD specifies where the Lambda handler is, and its value is passed to the ENTRYPOINT of the base image.</p>
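<p>For illustration, a minimal app.py could look like the sketch below. The event shape and the response fields are assumptions (a proxy-style response with a status code, headers, and a JSON body), not the code of the real application. The only things fixed by the setup above are the module name <code>app</code> and the function name <code>handler</code>.</p>

```python
import json


def handler(event, context):
    """Entry point that the runtime interface client loads as "app.handler"."""
    # Assumption: the event is a dict that may carry a "name" key.
    name = (event or {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello {name}"}),
    }


if __name__ == "__main__":
    # Local smoke test; the Lambda runtime passes a real context object
    # instead of None.
    print(handler({"name": "lambda"}, None))
```

<p>Because the handler is a plain function that takes an event and a context, it can be called directly on a developer machine before the image is ever built.</p>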
<p>This image is built similarly to the base image: with Ninja. This is our build.ninja file for the app-specific image:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">version</span>   <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">0.6.1</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">aws_account_id</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">xxxxxxxx</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">repo</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">my-repo</span>
</div><div class="line" data-line="4">
</div><div class="line" data-line="5">
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">rule</span> <span style="color: #e6edf3;">login-to-ecr</span>
</div><div class="line" data-line="7">  <span style="color: #d2a8ff;">command</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">aws</span> <span style="color: #e6edf3;">ecr</span> <span style="color: #e6edf3;">get-login-password</span> <span style="color: #e6edf3;">--region</span> <span style="color: #e6edf3;">eu-west-1</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">docker</span> <span style="color: #e6edf3;">login</span> <span style="color: #e6edf3;">--username</span> <span style="color: #e6edf3;">AWS</span> <span style="color: #e6edf3;">--password-stdin</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">aws_account_id</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">.dkr.ecr.eu-west-1.amazonaws.com</span>
</div><div class="line" data-line="8">  <span style="color: #d2a8ff;">description</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">Logging</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">ECR</span>
</div><div class="line" data-line="9">
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">rule</span> <span style="color: #e6edf3;">build-image</span>
</div><div class="line" data-line="11">  <span style="color: #d2a8ff;">command</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">.</span> <span style="color: #e6edf3;">-t</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:backend-api-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">version</span><span style="color: #e6edf3;">&rbrace;</span> <span style="color: #e6edf3;">--file</span> <span style="color: #e6edf3;">Dockerfile</span>
</div><div class="line" data-line="12">  <span style="color: #d2a8ff;">description</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">Building</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:backend-api-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">version</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="13">
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">rule</span> <span style="color: #e6edf3;">tag-image</span>
</div><div class="line" data-line="15">  <span style="color: #d2a8ff;">command</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">tag</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:backend-api-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">version</span><span style="color: #e6edf3;">&rbrace;</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">aws_account_id</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">.dkr.ecr.eu-west-1.amazonaws.com/</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:backend-api-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">version</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="16">  <span style="color: #d2a8ff;">description</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">Tagging</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:backend-api-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">version</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="17">
</div><div class="line" data-line="18"><span style="color: #d2a8ff;">rule</span> <span style="color: #e6edf3;">upload-image</span>
</div><div class="line" data-line="19">  <span style="color: #d2a8ff;">command</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">push</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">aws_account_id</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">.dkr.ecr.eu-west-1.amazonaws.com/</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:backend-api-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">version</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="20">  <span style="color: #d2a8ff;">description</span> <span style="color: #e6edf3;">=</span> <span style="color: #e6edf3;">Push</span> <span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">repo</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">:backend-api-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">version</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="21">
</div><div class="line" data-line="22">
</div><div class="line" data-line="23"><span style="color: #d2a8ff;">build</span> <span style="color: #e6edf3;">login-to-ecr:</span> <span style="color: #e6edf3;">login-to-ecr</span>
</div><div class="line" data-line="24"><span style="color: #d2a8ff;">build</span> <span style="color: #e6edf3;">build-image:</span> <span style="color: #e6edf3;">build-image</span> <span style="color: #79c0ff;">||</span> <span style="color: #d2a8ff;">login-to-ecr</span>
</div><div class="line" data-line="25"><span style="color: #d2a8ff;">build</span> <span style="color: #e6edf3;">tag-image:</span> <span style="color: #e6edf3;">tag-image</span> <span style="color: #79c0ff;">||</span> <span style="color: #d2a8ff;">build-image</span>
</div><div class="line" data-line="26"><span style="color: #d2a8ff;">build</span> <span style="color: #e6edf3;">upload-image:</span> <span style="color: #e6edf3;">upload-image</span> <span style="color: #79c0ff;">||</span> <span style="color: #d2a8ff;">tag-image</span>
</div><div class="line" data-line="27">
</div><div class="line" data-line="28"><span style="color: #d2a8ff;">default</span> <span style="color: #e6edf3;">login-to-ecr</span> <span style="color: #e6edf3;">build-image</span> <span style="color: #e6edf3;">tag-image</span> <span style="color: #e6edf3;">upload-image</span>
</div></code></pre>
<p>Let's run it:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">ninja</span> <span style="color: #e6edf3;">-f</span> <span style="color: #e6edf3;">build.ninja</span>
</div></code></pre>
<p>Now there is an image in ECR that hosts our application. You can very easily move back and forth between different base image versions, try out new libraries, or update your application. Because we deploy the same image to dev and prod, you can gain confidence in the version you are working on as it passes through the different stages from local to dev and finally to prod.</p>
<p>I have left testing, formatting, and linting out of the build process shown here because the complete build file with those steps would be too much to display. We use pytest, Ruff, and Black for most of these tasks. We also implemented integration tests in pytest for more thorough testing.</p>
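<p>As a rough idea of what such a test can look like, here is a pytest-style sketch. The handler is a stand-in defined inline to keep the example self-contained; in a real project it would be imported from app.py. The event and the assertions are hypothetical. pytest collects functions whose names start with <code>test_</code>.</p>

```python
import json


def handler(event, context):
    # Stand-in for the real handler in app.py (hypothetical behaviour):
    # it echoes the incoming event back as a JSON body.
    return {"statusCode": 200, "body": json.dumps({"echo": event})}


def test_handler_returns_ok():
    response = handler({"ping": 1}, None)
    assert response["statusCode"] == 200


def test_handler_body_is_valid_json():
    response = handler({"ping": 1}, None)
    assert json.loads(response["body"]) == {"echo": {"ping": 1}}
```

<p>Saved as a file whose name starts with <code>test_</code>, this runs with a plain <code>pytest</code> invocation, which makes it easy to add as one more step before the image build.</p>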
<h3 id="lambda-function-with-terraform"><a href="#lambda-function-with-terraform">Lambda Function with Terraform</a></h3>
<p>Our final task is to deploy to AWS Lambda. We use Terraform to achieve this. The package_type must be Image, and the image URI needs to be provided. We take advantage of ARM64 with this setup, which is a bit cheaper than running on AMD64.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">resource &quot;aws_lambda_function&quot; &quot;docker-lambda-function&quot; &lbrace;
</div><div class="line" data-line="2">  function_name = &quot;my-lambda-function&quot;
</div><div class="line" data-line="3">  description   = &quot;This is my lambda function&quot;
</div><div class="line" data-line="4">  image_uri     = &quot;&lt;account_id&gt;.dkr.ecr.eu-west-1.amazonaws.com/&lt;repo&gt;:backend-api-&lt;lambda-function-version&gt;&quot;
</div><div class="line" data-line="5">  package_type  = &quot;Image&quot;
</div><div class="line" data-line="6">  role          = &lt;lambda_role_arn&gt;
</div><div class="line" data-line="7">  memory_size   = 2048
</div><div class="line" data-line="8">  timeout       = 10
</div><div class="line" data-line="9">  architectures = [&quot;arm64&quot;]
</div><div class="line" data-line="10">&rbrace;
</div></code></pre>
<p>After deploying to AWS Lambda, we also create a CloudFront distribution and an API Gateway v2 API that fronts Lambda. The new Lambda function URLs make it possible to skip API Gateway. Maybe later on we can look at the tradeoffs of that setup.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2>
<p>Using a Docker image as the packaging for an AWS Lambda function provides a consistent, immutable environment for your application, ensuring that it runs the same way regardless of where it is deployed. This can help reduce errors and unexpected behavior and make it easier to package and distribute your application and its dependencies. Python 3.11 is much faster in many cases than previous versions of Python.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Tue, 28 Feb 2023 19:32:01 +0100</pubDate>
    </item>
    <item>
      <title>Atlantis</title>
      <link>https://dev.l1x.be/ai/atlantis/</link>
      <guid isPermaLink="true">https://dev.l1x.be/ai/atlantis/</guid>
      <content:encoded><![CDATA[<p>civilization that built the pyramids had knowledge of astronomy and celestial mechanics, city of Atlantis on Earth 60000 years ago, civilization at peak, people wearing Horus mask walk around in white tunic, in the background a pyramid is operating as a power plant receiving focused sun light from space, wide angle, very detailed, wide, ultra realistic science fiction 8k, cinematic, photography, depth of field, high resolution, optics, scattering, glowing, shadows, hyper realistic --ar 16:9 --seed 88 --v 4</p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sat, 25 Feb 2023 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Beyond the Borrow Checker</title>
      <link>https://dev.l1x.be/posts/2023/02/22/beyond-the-borrow-checker/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2023/02/22/beyond-the-borrow-checker/</guid>
      <content:encoded><![CDATA[<p><img src="/static/img/og/rust_etl.webp" alt="Rust for ETL" /></p>
<h2 id="abstract"><a href="#abstract">Abstract</a></h2>
<p>In this blog post, we explore the reasons why Rust is an excellent choice for Extract, Transform, Load (ETL) jobs. I have been writing ETL jobs for small and big companies in Java, Scala, and Python for over a decade. Python is an obvious winner of this segment and its popularity shows up in StackOverflow surveys. Because of its popularity there is unlimited content on the internet on how to do certain tasks in Python and you can get help on the usual platforms very easily. However, Python has some pretty big downsides.</p>
<p>First of all, being a dynamic interpreted language makes it pretty hard to trust your code. The chance that it works for you and fails in production is non-trivial. Docker and other methods help you with this but why would you need to use another tool just to have reliable deployments? I have wasted a lot of hours on Python deployment issues and it takes a lot to have a reliable setup that smaller teams usually don't have.</p>
<p>This is where Rust comes into the picture.</p>
<p>Rust's features like static cross-platform building, a single way to configure projects, simple dependency management, memory safety, and first-class serialization and deserialization through the serde crate make it an attractive option for data processing tasks. The borrow checker ensures that memory is managed correctly, making it almost impossible to write code with memory errors, resulting in reliable and safe ETL jobs.</p>
<p>We usually run ETL jobs on execution platforms like k8s with Airflow or AWS Glue with Spark. These platforms are usually pretty slow and inefficient for smaller tasks and do not yield great performance out of the box. Companies end up hiring experts who can fine-tune these systems to specific workloads.</p>
<p>If you're looking for a modern and efficient alternative for your ETL jobs, Rust is worth considering.</p>
<h2 id="why-on-earth-would-anybody-use-rust-for-etl"><a href="#why-on-earth-would-anybody-use-rust-for-etl">Why on Earth would anybody use Rust for ETL?</a></h2>
<ul>
<li>Static cross-platform building</li>
</ul>
<p>Rust's cross-platform building allows developers to create binary executables for different operating systems and CPU architectures from the same code base. For each target, Rust produces a single, statically linked binary that can be copied to the machine and run without installing a runtime. This makes it easy to write ETL jobs that can be run on different systems.</p>
<ul>
<li>Single way to configure your project</li>
</ul>
<p>Rust has a great approach to configuring projects. Instead of relying on a variety of configuration files, Rust uses a single configuration file, Cargo.toml, to configure the project. You can, of course, add more configuration files if you want, but the official way is Cargo.toml. This makes it easy to manage and maintain the project's dependencies and configuration. Python has at least three different ways of doing the same.</p>
<ul>
<li>Handling dependency versions is simple</li>
</ul>
<p>One of the most significant challenges in developing software is managing dependencies. Rust solves this problem by using the Cargo package manager. Cargo allows developers to declare dependencies and their versions in the Cargo.toml file. Cargo then downloads and installs the required dependencies, ensuring that they work together correctly. While Python has many issues with different operating system dependencies and Python versions, Rust does not. I have seen a problem once where a dependency had some issues but it was trivial to fix and the workaround was obvious. I cannot say the same about Python where I run into problems with libraries not supporting a certain combination of Python and operating system.</p>
<ul>
<li>It is almost impossible to write broken libraries</li>
</ul>
<p>Rust has a unique feature that helps developers write correct code. The borrow checker ensures that memory is managed correctly, making it almost impossible to write code with memory errors. On top of that, the strict type system checks type safety across the whole code base at compile time, and this makes it easy to write reliable and safe ETL jobs.</p>
<ul>
<li>Serialization and deserialization are simple</li>
</ul>
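<p>Rust has excellent support for the serialization and deserialization of data through the serde crate, which is the de facto standard in the ecosystem: you derive Serialize and Deserialize on your types and pick a format crate. This makes it easy to read and write data in various formats, making it an excellent choice for ETL jobs.</p>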
<ul>
<li>Async makes parallel and concurrent workflows easy to write</li>
</ul>
<p>Python also has some async capabilities but I like Rust a bit more. I think the people who designed Rust async did a great job from a practical point of view. I know it could be better and I am aware of its largely theoretical criticism.</p>
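<p>For reference, this is roughly what the Python side of a concurrent workflow looks like. It is a minimal sketch using only the standard library; fetch_eid and fetch_all are hypothetical names and the sleep stands in for a real network call.</p>

```python
import asyncio
from typing import List


async def fetch_eid(name: str) -> str:
    # Hypothetical stand-in for a network call such as an AWS API request.
    await asyncio.sleep(0.01)
    return "eid-" + name


async def fetch_all(names: List[str]) -> List[str]:
    # gather() runs the coroutines concurrently and keeps the input order.
    return await asyncio.gather(*(fetch_eid(n) for n in names))


eids = asyncio.run(fetch_all(["sales", "users", "events"]))
print(eids)  # ['eid-sales', 'eid-users', 'eid-events']
```

<p>The three lookups wait at the same time instead of one after the other, which is the property we care about when an ETL job spends most of its time waiting on remote APIs.</p>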
<p>Ok, so let us have a look at a side-by-side comparison of writing a medium size ETL job in both languages.</p>
<h3 id="types"><a href="#types">Types</a></h3>
<p>First, let's create a type in each language to represent the problem at hand. We would like to create a two-way lookup structure that stores names and their hashes, so that we can look up a hash by name and a name by hash.</p>
<ul>
<li>Python</li>
</ul>
<p>Just to have something easy to work with you need to use a library in Python. It has class-based higher-level types which are a bit harder to use in practice than the Rust equivalent.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-python" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">from</span> <span style="color: #e6edf3;">typing</span> <span style="color: #ff7b72;">import</span> <span style="color: #ffa657;">Dict</span>, <span style="color: #ffa657;">Optional</span>
</div><div class="line" data-line="2">
</div><div class="line" data-line="3"><span style="color: #ff7b72;">from</span> <span style="color: #e6edf3;">pydantic</span> <span style="color: #ff7b72;">import</span> <span style="color: #ffa657;">BaseModel</span>
</div><div class="line" data-line="4">
</div><div class="line" data-line="5"><span style="color: #ff7b72;">class</span> <span style="color: #ffa657;">EidCache</span>(<span style="color: #ffa657;">BaseModel</span>):
</div><div class="line" data-line="6">    <span style="color: #e6edf3;">forward_dict</span>: <span style="color: #ffa657;">Dict</span>[<span style="color: #ffa657;">str</span>, <span style="color: #ffa657;">str</span>]
</div><div class="line" data-line="7">    <span style="color: #e6edf3;">reverse_dict</span>: <span style="color: #ffa657;">Dict</span>[<span style="color: #ffa657;">str</span>, <span style="color: #ffa657;">str</span>]
</div><div class="line" data-line="8">    <span style="color: #e6edf3;">created_at</span>: <span style="color: #ffa657;">Optional</span>[<span style="color: #ffa657;">str</span>]
</div></code></pre>
<ul>
<li>Rust</li>
</ul>
<p>The Rust version is roughly the same.</p>
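<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">use</span> <span style="color: #e6edf3;">serde</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">&lbrace;</span><span style="color: #ffa657;">Deserialize</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">Serialize</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="2"><span style="color: #ff7b72;">use</span> <span style="color: #e6edf3;">std</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">collections</span><span style="color: #e6edf3;">::</span><span style="color: #ffa657;">BTreeMap</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="3">
</div><div class="line" data-line="4"><span style="color: #e6edf3;">#</span><span style="color: #e6edf3;">[</span><span style="color: #d2a8ff;">derive</span><span style="color: #e6edf3;">(</span><span style="color: #ffa657;">Serialize</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">Deserialize</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">Debug</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="5"><span style="color: #ff7b72;">pub</span> <span style="color: #ff7b72;">struct</span> <span style="color: #ffa657;">EidCache</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="6">    <span style="color: #ff7b72;">pub</span> <span style="color: #79c0ff;">forward_dict</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">BTreeMap</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">String</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">String</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="7">    <span style="color: #ff7b72;">pub</span> <span style="color: #79c0ff;">reverse_dict</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">BTreeMap</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">String</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">String</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="8">    <span style="color: #79c0ff;">created_at</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">String</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="9">    <span style="color: #79c0ff;">request_count</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">u64</span><span style="color: #e6edf3;">,</span> <span style="color: #8b949e;">// set by the constructor in the Code section; the integer type is an assumption</span>
</div><div class="line" data-line="10"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>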
<h3 id="code"><a href="#code">Code</a></h3>
<p>This is my first Rust code, so go easy on me. :) I am not sure how to reduce the memory allocation, maybe use references instead of creating a new string every time.</p>
<ul>
<li>Python</li>
</ul>
<p>Python is a bit easier to write if you are already familiar with dictionary comprehension.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-python" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">def</span> <span style="color: #d2a8ff;">get_reverse_dictionary</span>(<span style="color: #e6edf3;">d</span>: <span style="color: #ffa657;">Dict</span>[<span style="color: #ffa657;">str</span>, <span style="color: #ffa657;">str</span>]):
</div><div class="line" data-line="2">    <span style="color: #ff7b72;">return</span> &lbrace;<span style="color: #e6edf3;">v</span>: <span style="color: #e6edf3;">k</span> <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">k</span>, <span style="color: #e6edf3;">v</span> <span style="color: #79c0ff;">in</span> <span style="color: #e6edf3;">d</span>.<span style="color: #79c0ff;">items</span>()&rbrace;
</div><div class="line" data-line="3">
</div><div class="line" data-line="4">
</div><div class="line" data-line="5"><span style="color: #ff7b72;">def</span> <span style="color: #d2a8ff;">create_db_eid_cache</span>():
</div><div class="line" data-line="6">    <span style="color: #e6edf3;">database_names</span> <span style="color: #79c0ff;">=</span> <span style="color: #d2a8ff;">get_all_database_names</span>()
</div><div class="line" data-line="7">    <span style="color: #e6edf3;">forward_dict</span> <span style="color: #79c0ff;">=</span> &lbrace;<span style="color: #e6edf3;">name</span>: <span style="color: #d2a8ff;">get_db_eid</span>(<span style="color: #e6edf3;">name</span>) <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">name</span> <span style="color: #79c0ff;">in</span> <span style="color: #e6edf3;">database_names</span>&rbrace;
</div><div class="line" data-line="8">    <span style="color: #e6edf3;">reverse_dict</span> <span style="color: #79c0ff;">=</span> <span style="color: #d2a8ff;">get_reverse_dictionary</span>(<span style="color: #e6edf3;">forward_dict</span>)
</div><div class="line" data-line="9">
</div><div class="line" data-line="10">    <span style="color: #ff7b72;">return</span> <span style="color: #d2a8ff;">EidCache</span>(
</div><div class="line" data-line="11">        <span style="color: #e6edf3;">forward_dict</span><span style="color: #79c0ff;">=</span><span style="color: #e6edf3;">forward_dict</span>, <span style="color: #e6edf3;">reverse_dict</span><span style="color: #79c0ff;">=</span><span style="color: #e6edf3;">reverse_dict</span>, <span style="color: #e6edf3;">created_at</span><span style="color: #79c0ff;">=</span><span style="color: #d2a8ff;">iso_utc_now</span>()
</div><div class="line" data-line="12">    )
</div></code></pre>
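<p>To try the two-way lookup in isolation, here is a minimal, self-contained sketch. It swaps pydantic for a standard-library dataclass so it runs without dependencies, and build_eid_cache and fake_eid are hypothetical names that stand in for the project's real helpers. It also makes one assumption explicit: reversing a dictionary only works if the values are unique.</p>

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable


@dataclass
class EidCache:
    forward_dict: Dict[str, str]
    reverse_dict: Dict[str, str]
    created_at: str


def build_eid_cache(names: Iterable[str], eid_fun: Callable[[str], str]) -> EidCache:
    # name -> eid
    forward = {name: eid_fun(name) for name in names}
    # eid -> name; a duplicate eid would silently overwrite an earlier entry
    reverse = {eid: name for name, eid in forward.items()}
    if len(reverse) != len(forward):
        raise ValueError("eids are not unique, reverse lookup would lose entries")
    created_at = datetime.now(timezone.utc).isoformat()
    return EidCache(forward, reverse, created_at)


# Hypothetical stand-in for the real eid lookup.
def fake_eid(name: str) -> str:
    return "eid-" + name


cache = build_eid_cache(["sales", "users"], fake_eid)
print(cache.forward_dict["sales"])      # eid-sales
print(cache.reverse_dict["eid-users"])  # users
```

<p>The uniqueness check is a design choice of this sketch, not something the original job does. Without it, two names that map to the same eid would leave only one of them in the reverse dictionary.</p>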
<ul>
<li>Rust</li>
</ul>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">fn</span> <span style="color: #d2a8ff;">create_eid_cache</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">F</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">names</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Option</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">String</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">eid_fun</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">F</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">-&gt;</span> <span style="color: #ffa657;">Option</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">EidCache</span><span style="color: #e6edf3;">&gt;</span>
</div><div class="line" data-line="2"><span style="color: #ff7b72;">where</span>
</div><div class="line" data-line="3">    <span style="color: #ffa657;">F</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Fn</span><span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">&amp;</span><span style="color: #ff7b72;">str</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">-&gt;</span> <span style="color: #ffa657;">String</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="5">    <span style="color: #ff7b72;">match</span> <span style="color: #e6edf3;">names</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="6">        <span style="color: #79c0ff;">Some</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">xs</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">=&gt;</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="7">            <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">xs_with_eid</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Vec</span><span style="color: #e6edf3;">&lt;</span><span style="color: #e6edf3;">(</span><span style="color: #ffa657;">String</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">String</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span>
</div><div class="line" data-line="8">                <span style="color: #e6edf3;">xs</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">iter</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">map</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">x</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">x</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">to_string</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span> <span style="color: #d2a8ff;">eid_fun</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">x</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">collect</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="9">
</div><div class="line" data-line="10">            <span style="color: #ff7b72;">let</span> <span style="color: #ff7b72;">mut</span> <span style="color: #e6edf3;">forward_dict</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">BTreeMap</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">String</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">String</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #ffa657;">BTreeMap</span><span style="color: #e6edf3;">::</span><span style="color: #d2a8ff;">new</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="11">            <span style="color: #ff7b72;">let</span> <span style="color: #ff7b72;">mut</span> <span style="color: #e6edf3;">reverse_dict</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">BTreeMap</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">String</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">String</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #79c0ff;">=</span> <span style="color: #ffa657;">BTreeMap</span><span style="color: #e6edf3;">::</span><span style="color: #d2a8ff;">new</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="12">
</div><div class="line" data-line="13">            <span style="color: #e6edf3;">xs_with_eid</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">iter</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">for_each</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">|</span><span style="color: #e6edf3;">x</span><span style="color: #e6edf3;">|</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="14">                <span style="color: #e6edf3;">forward_dict</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">insert</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">x</span><span style="color: #e6edf3;">.</span><span style="color: #79c0ff;">0</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">clone</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">x</span><span style="color: #e6edf3;">.</span><span style="color: #79c0ff;">1</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">clone</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="15">                <span style="color: #e6edf3;">reverse_dict</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">insert</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">x</span><span style="color: #e6edf3;">.</span><span style="color: #79c0ff;">1</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">clone</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">x</span><span style="color: #e6edf3;">.</span><span style="color: #79c0ff;">0</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">clone</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="16">                <span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="17">            <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="18">
</div><div class="line" data-line="19">            <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">created_at</span> <span style="color: #79c0ff;">=</span> <span style="color: #ffa657;">OffsetDateTime</span><span style="color: #e6edf3;">::</span><span style="color: #d2a8ff;">now_utc</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">format</span><span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">&amp;</span><span style="color: #ffa657;">Iso8601</span><span style="color: #e6edf3;">::</span><span style="color: #79c0ff;">DEFAULT</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">unwrap</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="20">
</div><div class="line" data-line="21">            <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">request_count</span> <span style="color: #79c0ff;">=</span> <span style="color: #79c0ff;">0</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="22">
</div><div class="line" data-line="23">            <span style="color: #79c0ff;">Some</span><span style="color: #e6edf3;">(</span><span style="color: #ffa657;">EidCache</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="24">                <span style="color: #79c0ff;">forward_dict</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="25">                <span style="color: #79c0ff;">reverse_dict</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="26">                <span style="color: #79c0ff;">created_at</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="27">                <span style="color: #79c0ff;">request_count</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="28">            <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="29">        <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="30">        <span style="color: #79c0ff;">None</span> <span style="color: #e6edf3;">=&gt;</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="31">            <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">msg</span> <span style="color: #79c0ff;">=</span> <span style="color: #a5d6ff;">&quot;There are no name entries&quot;</span><span style="color: #e6edf3;">.</span><span style="color: #d2a8ff;">to_string</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="32">            <span style="color: #79c0ff;">info</span><span style="color: #d2a8ff;">!</span><span style="color: #e6edf3;">(</span><span style="color: #a5d6ff;">&quot;&lbrace;&rbrace;&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">msg</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="33">            <span style="color: #79c0ff;">None</span>
</div><div class="line" data-line="34">        <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="35">    <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="36"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
<p>As you can see, the two versions are comparable in length, the Rust version being a bit longer.</p>
<p>The whole project is 1150 lines of Rust and 687 lines of Python. It involves talking to different AWS APIs, calculating checksums, and writing summary data to S3. Because it is easy to use multiple files in Rust, we split the code into smaller, more manageable chunks and compile it into a single binary before deploying the function to Lambda. With Python we need to zip up the files and jump through all sorts of hoops because AWS supports only Python 3.7 or 3.10 depending on which Glue version is available. Dependency management is much easier with Rust: we simply use Cargo.toml for it, while in the Python case you have to provide your dependencies, which get installed at run time, adding more time to the already slow execution.</p>
<h3 id="performance"><a href="#performance">Performance</a></h3>
<p>Our Python code runs for around 2 minutes and 30 seconds (p50) because it has to start up the Spark cluster and do lots of things that we do not need. I wanted to understand what it would look like if we moved this workload out of Glue into Lambda and used Rust instead of Python. I could have used Python on Lambda as well, and I am going to write an article about that experience next. Anyway, our first attempt with Rust was extremely smooth thanks to cargo-lambda and the language tooling. The p50 value for the AWS Lambda function with Rust is 1.3 seconds. Yes, this is an apples-to-oranges comparison.</p>
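<p>To put the two p50 values side by side, here is the arithmetic as a small sketch. The only inputs are the two numbers quoted above. Most of the difference comes from the platform (Glue with Spark versus Lambda), not from the language alone.</p>

```python
glue_p50_seconds = 2 * 60 + 30   # 2 minutes 30 seconds, Python on Glue
lambda_p50_seconds = 1.3         # Rust on Lambda

speedup = glue_p50_seconds / lambda_p50_seconds
saved_per_run = glue_p50_seconds - lambda_p50_seconds

print(f"speedup: {speedup:.0f}x")               # speedup: 115x
print(f"saved per run: {saved_per_run:.1f} s")  # saved per run: 148.7 s
```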
<p>I think that is a pretty big improvement that we cannot ignore. As far as developer productivity goes: yes you can write Python faster than Rust but you are going to pay the penalty later at deployment time and when you run in production.</p>
<h2 id="summary"><a href="#summary">Summary</a></h2>
<p>I am pretty happy with Rust and we are going to use it more and more. I am ok with spending a bit more time on development if it significantly reduces the operational effort we have to put in to maintain the system.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 22 Feb 2023 08:32:01 +0100</pubDate>
    </item>
    <item>
      <title>Misusing Ninja</title>
      <link>https://dev.l1x.be/posts/2023/02/21/misusing-ninja/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2023/02/21/misusing-ninja/</guid>
      <content:encoded><![CDATA[<p><img src="/static/img/og/ninja_build.webp" alt="Ninja" /></p>
<h2 id="abstract"><a href="#abstract">Abstract</a></h2>
<p>In the world of software development, build systems are essential tools for compiling and linking source code into executables. While there are many build systems available (make, cmake), one of the most popular and powerful is the Ninja build system.</p>
<p>Developed by Evan Martin while working on the Chromium project at Google, Ninja is a fast, scalable, and cross-platform build system that has gained widespread adoption in the software development community. In this blog post, we'll take a closer look at the Ninja build system and its key features.</p>
<h2 id="how-ninja-works"><a href="#how-ninja-works">How Ninja Works</a></h2>
<p>At its core, Ninja is a simple, low-level build system that aims to be fast and efficient. Ninja works by generating a graph of dependencies between input files and output files and then executing a series of build commands to transform the input files into the output files.</p>
<p>The input files are typically source code files written in a programming language such as C++, Rust, or in our case Python. The output files are the compiled object files, libraries, and executables that result from the build process. And this is where it gets interesting. You can safely ignore this part of Ninja and just use it as a DAG orchestrator that executes commands in a certain order.</p>
<p>Ninja reads a file called &quot;build.ninja&quot; that defines the build graph and specifies the build commands. The build.ninja file is a plain text file that describes the dependencies between the input and output files or, in our case, between the build steps.</p>
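<p>Since build.ninja is plain text, it can also be generated from a script. The following is a minimal sketch of that idea, not how our project does it: render_build_file and the extract/load step names and commands are hypothetical. It emits one rule and one build statement per step and uses the two-pipe (order-only) dependency syntax that is explained in the examples later in this post.</p>

```python
from typing import Dict, List


def render_build_file(commands: Dict[str, str], order_only: Dict[str, List[str]]) -> str:
    """Render one rule and one build statement per step."""
    lines: List[str] = []
    for step, command in commands.items():
        lines.append(f"rule {step}")
        lines.append(f"  command = {command}")
        lines.append("")
    for step in commands:
        deps = order_only.get(step, [])
        # Steps listed after the two pipes have to run before this step.
        suffix = " || " + " ".join(deps) if deps else ""
        lines.append(f"build {step}: {step}{suffix}")
    return "\n".join(lines) + "\n"


# Hypothetical two-step pipeline: load depends on extract.
text = render_build_file(
    {"extract": "python extract.py", "load": "python load.py"},
    {"load": ["extract"]},
)
print(text)
```

<p>Writing the returned string to a file named build.ninja gives Ninja the same kind of input as the hand-written examples in this post.</p>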
<h2 id="key-features-of-ninja"><a href="#key-features-of-ninja">Key Features of Ninja</a></h2>
<p>One of the main advantages of Ninja is its speed. Because Ninja generates a build graph that only includes the necessary build steps, it can execute builds quickly and efficiently. Additionally, Ninja can scale to handle large projects with many dependencies, making it a popular choice for building complex software applications. For our use case this does not matter much, but if you are using it for a C++ project it is pretty good.</p>
<p>Another key feature of Ninja is its cross-platform support. Ninja is designed to work on multiple operating systems, including Windows, macOS, and Linux, which makes it easy to use in a variety of development environments. This is true if you have a way of making sure that the tools that are used in the build are present and have the same CLI on all systems you run your builds on.</p>
<h2 id="getting-started-with-ninja"><a href="#getting-started-with-ninja">Getting Started with Ninja</a></h2>
<p>I usually learn by example and there aren't many examples around. I have read the original documentation and tried a few things out, and I finally understand how everything hangs together.</p>
<p>There are 3 things that you need to know about a ninja build file:</p>
<ul>
<li>rule</li>
<li>build</li>
<li>dependency</li>
</ul>
<p>A rule is a description and a command that gets executed when the rule is invoked.</p>
<p>A build step invokes a rule (and, through its dependencies, potentially other build steps first).</p>
<p>A dependency describes a relationship between build steps: which steps have to run before a given step.</p>
<p>Let's have a look at a basic example.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">rule ok
</div><div class="line" data-line="2">  command = echo &quot;ok&quot;
</div><div class="line" data-line="3">
</div><div class="line" data-line="4">build ok: ok
</div></code></pre>
<p>Invoking it:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">ninja</span> <span style="color: #e6edf3;">ok</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">[</span>1/1<span style="color: #e6edf3;">]</span> <span style="color: #d2a8ff;">echo</span> <span style="color: #a5d6ff;">&quot;ok&quot;</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">ok</span>
</div></code></pre>
<p>Let's add a build that has two steps (sometimes called stages or targets).</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">rule one
</div><div class="line" data-line="2">  command = echo &quot;one&quot;
</div><div class="line" data-line="3">
</div><div class="line" data-line="4">rule two
</div><div class="line" data-line="5">  command = echo &quot;two&quot;
</div><div class="line" data-line="6">
</div><div class="line" data-line="7">build one: one
</div><div class="line" data-line="8">build two: two || one
</div></code></pre>
<p>The two pipe characters (||) declare an order-only dependency between the build stages: ninja builds one before it builds two.</p>
<p>Invoking the build:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">ninja</span> <span style="color: #e6edf3;">one</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">[</span>1/1<span style="color: #e6edf3;">]</span> <span style="color: #d2a8ff;">echo</span> <span style="color: #a5d6ff;">&quot;one&quot;</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">one</span>
</div></code></pre>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">❯</span> <span style="color: #e6edf3;">ninja</span> <span style="color: #e6edf3;">two</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">[</span>1/2<span style="color: #e6edf3;">]</span> <span style="color: #d2a8ff;">echo</span> <span style="color: #a5d6ff;">&quot;one&quot;</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">one</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">[</span>2/2<span style="color: #e6edf3;">]</span><span style="color: #e6edf3;"></span> <span style="color: #d2a8ff;">echo</span> <span style="color: #a5d6ff;">&quot;two&quot;</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">two</span>
</div></code></pre>
<p>This means we cannot forget to execute the first step when executing the second.</p>
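<p>Ninja accepts three kinds of dependencies on a build line, and the examples above use only the order-only kind. The following is a minimal sketch that puts all three side by side. The file names (main.c, util.h, version.h) and the gen-version.sh script are hypothetical and only serve as illustration.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">rule cc
</div><div class="line" data-line="2">  command = cc -c $in -o $out
</div><div class="line" data-line="3">  description = Compiling $in
</div><div class="line" data-line="4">
</div><div class="line" data-line="5">rule gen
</div><div class="line" data-line="6">  command = ./gen-version.sh &gt; $out
</div><div class="line" data-line="7">  description = Generating $out
</div><div class="line" data-line="8">
</div><div class="line" data-line="9"># version.h is written by the hypothetical gen-version.sh script
</div><div class="line" data-line="10">build version.h: gen
</div><div class="line" data-line="11">
</div><div class="line" data-line="12"># main.c is an explicit input and is available to the command as $in
</div><div class="line" data-line="13"># util.h (after |) is an implicit input: a change to it rebuilds main.o
</div><div class="line" data-line="14"># version.h (after ||) is order-only: it is built first, but a change to it alone does not rebuild main.o
</div><div class="line" data-line="15">build main.o: cc main.c | util.h || version.h
</div></code></pre>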
<h2 id="using-ninja-for-building-and-deploying-to-the-cloud"><a href="#using-ninja-for-building-and-deploying-to-the-cloud">Using Ninja for building and deploying to the cloud</a></h2>
<p>There is no point in writing an article without Docker in it so let's put Docker into Ninja.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">version   = 0.6.0
</div><div class="line" data-line="2">aws_account_id = 11111111111
</div><div class="line" data-line="3">
</div><div class="line" data-line="4">
</div><div class="line" data-line="5">rule login-to-ecr
</div><div class="line" data-line="6">  command = aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin $&lbrace;aws_account_id&rbrace;.dkr.ecr.eu-west-1.amazonaws.com
</div><div class="line" data-line="7">  description = Logging in to ECR
</div><div class="line" data-line="8">
</div><div class="line" data-line="9">rule init
</div><div class="line" data-line="10">  command = env | egrep AWS_PROFILE
</div><div class="line" data-line="11">  description = Displaying AWS_PROFILE
</div><div class="line" data-line="12">
</div><div class="line" data-line="13">
</div><div class="line" data-line="14">rule build-image
</div><div class="line" data-line="15">  command = docker build . -t depoxy:backend-api-$&lbrace;version&rbrace; --file Dockerfile
</div><div class="line" data-line="16">  description = Building depoxy:backend-api-$&lbrace;version&rbrace;
</div><div class="line" data-line="17">
</div><div class="line" data-line="18">rule tag-image
</div><div class="line" data-line="19">  command = docker tag depoxy:backend-api-$&lbrace;version&rbrace; $&lbrace;aws_account_id&rbrace;.dkr.ecr.eu-west-1.amazonaws.com/depoxy:backend-api-$&lbrace;version&rbrace;
</div><div class="line" data-line="20">  description = Tagging depoxy:backend-api-$&lbrace;version&rbrace;
</div><div class="line" data-line="21">
</div><div class="line" data-line="22">rule upload-image
</div><div class="line" data-line="23">  command = docker push $&lbrace;aws_account_id&rbrace;.dkr.ecr.eu-west-1.amazonaws.com/depoxy:backend-api-$&lbrace;version&rbrace;
</div><div class="line" data-line="24">  description = Push depoxy:backend-api-$&lbrace;version&rbrace;
</div><div class="line" data-line="25">
</div><div class="line" data-line="26">
</div><div class="line" data-line="27">
</div><div class="line" data-line="28">build login-to-ecr: login-to-ecr
</div><div class="line" data-line="29">build init: init || login-to-ecr
</div><div class="line" data-line="30">build build-image: build-image || init
</div><div class="line" data-line="31">build tag-image: tag-image || build-image
</div><div class="line" data-line="32">build upload-image: upload-image || tag-image
</div><div class="line" data-line="33">
</div><div class="line" data-line="34">default login-to-ecr init build-image tag-image upload-image
</div></code></pre>
<p>There it is: a simple Docker workflow implemented in Ninja. We use similar workflows to upload files to AWS and to deploy with Terraform. This unified our different build efforts, which mostly used YAML for CI/CD, and made sure we do not forget a step, without needing to write a tonne of YAML.</p>
<p>The reduction compared to the YAML-based workflows is pretty significant, as the diff summary shows:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">+229 −1,633
</div></code></pre>
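<p>A Terraform workflow in the same style might look like the sketch below. It is not our actual build file: the rule names and the tfplan file name are assumptions, and it only relies on the standard terraform init, plan, and apply subcommands.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">rule tf-init
</div><div class="line" data-line="2">  command = terraform init -input=false
</div><div class="line" data-line="3">  description = Initialising Terraform
</div><div class="line" data-line="4">
</div><div class="line" data-line="5">rule tf-plan
</div><div class="line" data-line="6">  command = terraform plan -input=false -out=tfplan
</div><div class="line" data-line="7">  description = Planning changes into tfplan
</div><div class="line" data-line="8">
</div><div class="line" data-line="9">rule tf-apply
</div><div class="line" data-line="10">  command = terraform apply -input=false tfplan
</div><div class="line" data-line="11">  description = Applying tfplan
</div><div class="line" data-line="12">
</div><div class="line" data-line="13">build tf-init: tf-init
</div><div class="line" data-line="14">build tf-plan: tf-plan || tf-init
</div><div class="line" data-line="15">build tf-apply: tf-apply || tf-plan
</div><div class="line" data-line="16">
</div><div class="line" data-line="17"># Only plan by default; applying has to be requested with: ninja tf-apply
</div><div class="line" data-line="18">default tf-plan
</div></code></pre>
<p>Keeping tf-apply out of the default target is a deliberate choice in this sketch, so that a plain ninja invocation never changes infrastructure.</p>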
<h2 id="to-get-started-with-ninja"><a href="#to-get-started-with-ninja">To get started with Ninja</a></h2>
<p>To get started with Ninja, install the Ninja build system on your development machine. Ninja is available from the package managers of most popular operating systems, or you can download the source code and build it yourself.</p>
<p>Once you have Ninja installed, you'll need to create a build.ninja file that describes your project's build process. You can create this file manually, or you can use a build system generator such as gn or Meson to generate it automatically.</p>
<p>Finally, run ninja to execute the build and create the output files. Ninja tracks changes to the input files declared in the build statements and re-runs only the commands whose outputs are out of date, so your output files stay up to date.</p>
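<p>This change tracking relies on build statements that name real input and output files. A minimal sketch, assuming a hypothetical in.txt that sits next to build.ninja:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">rule copy
</div><div class="line" data-line="2">  command = cp $in $out
</div><div class="line" data-line="3">  description = Copying $in to $out
</div><div class="line" data-line="4">
</div><div class="line" data-line="5"># out.txt is rebuilt only when it is missing or older than in.txt
</div><div class="line" data-line="6">build out.txt: copy in.txt
</div><div class="line" data-line="7">
</div><div class="line" data-line="8">default out.txt
</div></code></pre>
<p>The first run of ninja copies the file, and a second run should find nothing to do until in.txt changes. By contrast, the echo targets earlier in this article never produce a file, so ninja runs their commands on every invocation.</p>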
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2>
<p>The Ninja build system is a powerful and efficient tool for building software applications. With its speed, scalability, and cross-platform support, Ninja is a great choice for developers working on a wide range of projects. Whether you're building a small utility or a large, complex application, Ninja can help you streamline your build process and get your software up and running quickly. It helped us move from YAML to a much better alternative while reducing the complexity of our CI/CD workflows. The next step is to find a CI/CD platform that supports Ninja files, or maybe to create a project that does exactly that.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Tue, 21 Feb 2023 14:19:01 +0100</pubDate>
    </item>
    <item>
      <title>New tools</title>
      <link>https://dev.l1x.be/posts/2023/01/03/new-tools/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2023/01/03/new-tools/</guid>
      <content:encoded><![CDATA[<h2 id="abstract"><a href="#abstract">Abstract</a></h2>
<p>I am amazed by the amount of new software that is high quality, good-looking, and easy to use. Here are some of the favorites that I have encountered. Some of these (probably the majority) are written in Rust.</p>
<h2 id="list-of-new-tools"><a href="#list-of-new-tools">List of new tools</a></h2>
<h3 id="starship"><a href="#starship">Starship</a></h3>
<p>Starship is a simple, fast, and customizable cross-platform prompt for command line shells. It is designed to be lightweight and efficient, while still offering a wide range of customization options to suit your individual needs. With Starship, you can easily switch between different shells and operating systems without having to worry about how to carry your prompt with you. Whether you are a beginner or an experienced command line user, Starship is an excellent tool to improve your productivity and make your workflow more seamless.</p>
<p>You might be familiar with the following already, but this is what it looks like:</p>
<p><img src="/static/img/blog/starship.png" alt="Starship" /></p>
<p><a href="https://starship.rs/">link</a></p>
<h3 id="alacritty"><a href="#alacritty">Alacritty</a></h3>
<p>Alacritty is a high-performance terminal emulator that offers a range of customization options while still maintaining sensible defaults.</p>
<p>These are some of Alacritty's noteworthy features:</p>
<ul>
<li>Vi Mode — Move around and create selections using vi bindings</li>
<li>Search — Search for any text within the scrollback buffer</li>
<li>Regex Hints — Mark any text for mouse or keyboard interaction</li>
<li>Multi-Window — Improve resource usage by using only a single Alacritty process</li>
</ul>
<p><a href="https://alacritty.org/">link</a></p>
<h3 id="helix"><a href="#helix">Helix</a></h3>
<p>Helix is a new text editor that is designed to be fast, intuitive, and customizable. It features a modern, minimalist user interface. My favorite feature is the integration with <a href="https://tree-sitter.github.io/tree-sitter/">Tree-sitter</a>, which is probably the fastest parser generator out there.</p>
<p><img src="/static/img/blog/helix.png" alt="Helix" /></p>
<p><a href="https://helix-editor.com/">link</a></p>
<h3 id="ruff"><a href="#ruff">Ruff</a></h3>
<p>After trying many other Python linters, I found that Ruff does a great job while being extremely fast.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">Ruff aims to be orders of magnitude faster than alternative tools while
</div><div class="line" data-line="2">integrating more functionality behind a single, common interface.
</div></code></pre>
<p><a href="https://github.com/charliermarsh/ruff">link</a></p>
<h3 id="ripgrep"><a href="#ripgrep">Ripgrep</a></h3>
<p>ripgrep is a tool that allows you to search for a specific pattern (expressed using a regular expression) within the files in a directory and its subdirectories. It is designed to be fast and efficient, and it has some helpful default behaviors, such as ignoring files and directories that are listed in a .gitignore file and skipping over hidden files and binary files. This makes it a convenient and reliable option for quickly searching through large codebases or other collections of files. It is usually much faster than grep or similar alternatives.</p>
<p><a href="https://github.com/BurntSushi/ripgrep">link</a></p>
<h3 id="exa"><a href="#exa">Exa</a></h3>
<p>exa is a modern replacement for the classic ls command that ships with Unix and Linux operating systems. It provides several additional features and improvements over ls, such as color coding to distinguish file types and metadata. If you spend a lot of time on the command line, you might find exa a helpful alternative to the standard ls.</p>
<p><img src="/static/img/blog/exa.png" alt="Exa" /></p>
<p><a href="https://the.exa.website/">link</a></p>
<h3 id="zola"><a href="#zola">Zola</a></h3>
<p>Zola is a static site generator written in Rust. It is designed to be fast, easy to use, and flexible, making it a good choice for building static websites and blogs. Key features include Markdown support based on CommonMark (a strongly defined, highly compatible specification of Markdown), Sass compilation, syntax highlighting, and table of contents generation. It has fewer features than Hugo, though.</p>
<p><a href="https://www.getzola.org/">link</a></p>
<h3 id="duf"><a href="#duf">Duf</a></h3>
<p>Duf is a disk usage utility (similar to the widely used df) for displaying free disk space in a visually more appealing manner.</p>
<p><img src="/static/img/blog/duf.png" alt="Duf" /></p>
<h2 id="closing"><a href="#closing">Closing</a></h2>
<p>These are the new tools that I use nowadays. Hopefully, the list is useful for some of you as well.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Tue, 03 Jan 2023 13:54:01 +0100</pubDate>
    </item>
    <item>
      <title>Using LDAP in Docker with caching</title>
      <link>https://dev.l1x.be/posts/2021/03/22/using-ldap-in-docker-with-caching/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2021/03/22/using-ldap-in-docker-with-caching/</guid>
      <content:encoded><![CDATA[<h2 id="abstract"><a href="#abstract">Abstract</a></h2>
<p><a href="https://en.wikipedia.org/wiki/Lightweight_Directory_Access_Protocol">LDAP</a> and <a href="https://en.wikipedia.org/wiki/Active_Directory">Active Directory</a> are both directory services. These are used extensively in corporate environments as authentication and authorization solutions. In a modern infrastructure, applications quite often run in a containerized environment, and sometimes these environments need to have access to LDAP or AD. In this article, we investigate how a Docker container can efficiently access these directory services.</p>
<h2 id="creating-a-base-container"><a href="#creating-a-base-container">Creating a base container</a></h2>
<p>Ubuntu 18.04.5 LTS (Bionic Beaver) is a long-term support release, and this is what we are going to use going forward. The image needs a local user other than root for tasks like compiling code or installing NPM packages if necessary.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">FROM ubuntu:bionic-20210222
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">RUN apt-get clean &amp;&amp; apt-get update
</div><div class="line" data-line="4">
</div><div class="line" data-line="5">RUN apt-get install --reinstall ca-certificates -y
</div><div class="line" data-line="6">
</div><div class="line" data-line="7">RUN apt-get install sudo apt-transport-https ca-certificates -y
</div><div class="line" data-line="8">
</div><div class="line" data-line="9">RUN useradd -m -u 5000 admin || :
</div><div class="line" data-line="10">RUN useradd -m -u 5001 app || :
</div><div class="line" data-line="11">RUN echo &#39;admin ALL=(ALL) NOPASSWD:ALL&#39; &gt;&gt; /etc/sudoers
</div><div class="line" data-line="12">
</div><div class="line" data-line="13">RUN apt-get clean
</div><div class="line" data-line="14">
</div><div class="line" data-line="15">USER app
</div></code></pre>
<p>Creating a new tag is simple:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">.</span> <span style="color: #e6edf3;">-t</span> <span style="color: #e6edf3;">local/aloha-base:2021.03.23</span> <span style="color: #e6edf3;">--file</span> <span style="color: #e6edf3;">docker/0/Dockerfile</span>
</div><div class="line" data-line="2">
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">Sending</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">context</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">Docker</span> <span style="color: #e6edf3;">daemon</span>  <span style="color: #e6edf3;">229.9kB</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">1/9</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">FROM</span> <span style="color: #e6edf3;">ubuntu:bionic-20210222</span>
</div><div class="line" data-line="5"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">329ed837d508</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">2/9</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #e6edf3;">apt-get</span> <span style="color: #e6edf3;">clean</span> <span style="color: #79c0ff;">&amp;&amp;</span> <span style="color: #d2a8ff;">apt-get</span> <span style="color: #e6edf3;">update</span>
</div><div class="line" data-line="7"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="8"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">3fd8e7b8ccdc</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">3/9</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #e6edf3;">apt-get</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">--reinstall</span> <span style="color: #e6edf3;">ca-certificates</span> <span style="color: #e6edf3;">-y</span>
</div><div class="line" data-line="10"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="11"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">d1d2a538daf2</span>
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">4/9</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #e6edf3;">apt-get</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">sudo</span> <span style="color: #e6edf3;">apt-transport-https</span> <span style="color: #e6edf3;">ca-certificates</span> <span style="color: #e6edf3;">-y</span>
</div><div class="line" data-line="13"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="14"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">bc8142001e3e</span>
</div><div class="line" data-line="15"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">5/9</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #e6edf3;">useradd</span> <span style="color: #e6edf3;">-m</span> <span style="color: #e6edf3;">-u</span> <span style="color: #79c0ff;">5000</span> <span style="color: #e6edf3;">admin</span> <span style="color: #79c0ff;">||</span> <span style="color: #d2a8ff;">:</span>
</div><div class="line" data-line="16"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Running</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">9939670ed4aa</span>
</div><div class="line" data-line="17"><span style="color: #d2a8ff;">Removing</span> <span style="color: #e6edf3;">intermediate</span> <span style="color: #e6edf3;">container</span> <span style="color: #e6edf3;">9939670ed4aa</span>
</div><div class="line" data-line="18"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">ca9a0f390db3</span>
</div><div class="line" data-line="19"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">6/9</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #e6edf3;">useradd</span> <span style="color: #e6edf3;">-m</span> <span style="color: #e6edf3;">-u</span> <span style="color: #79c0ff;">5001</span> <span style="color: #e6edf3;">app</span> <span style="color: #79c0ff;">||</span> <span style="color: #d2a8ff;">:</span>
</div><div class="line" data-line="20"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Running</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">68cd99b566b5</span>
</div><div class="line" data-line="21"><span style="color: #d2a8ff;">Removing</span> <span style="color: #e6edf3;">intermediate</span> <span style="color: #e6edf3;">container</span> <span style="color: #e6edf3;">68cd99b566b5</span>
</div><div class="line" data-line="22"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">37c5b5146df9</span>
</div><div class="line" data-line="23"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">7/9</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #e6edf3;">echo</span> <span style="color: #a5d6ff;">&#39;admin ALL=(ALL) NOPASSWD:ALL&#39;</span> <span style="color: #79c0ff;">&gt;&gt;</span> <span style="color: #e6edf3;">/etc/sudoers</span>
</div><div class="line" data-line="24"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Running</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">a82a060ba035</span>
</div><div class="line" data-line="25"><span style="color: #d2a8ff;">Removing</span> <span style="color: #e6edf3;">intermediate</span> <span style="color: #e6edf3;">container</span> <span style="color: #e6edf3;">a82a060ba035</span>
</div><div class="line" data-line="26"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">d3e527406739</span>
</div><div class="line" data-line="27"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">8/9</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #e6edf3;">apt-get</span> <span style="color: #e6edf3;">clean</span>
</div><div class="line" data-line="28"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Running</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">9e4d7b7ac071</span>
</div><div class="line" data-line="29"><span style="color: #d2a8ff;">Removing</span> <span style="color: #e6edf3;">intermediate</span> <span style="color: #e6edf3;">container</span> <span style="color: #e6edf3;">9e4d7b7ac071</span>
</div><div class="line" data-line="30"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">117422e5341e</span>
</div><div class="line" data-line="31"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">9/9</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">USER</span> <span style="color: #e6edf3;">app</span>
</div><div class="line" data-line="32"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Running</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">b28f471b4c5d</span>
</div><div class="line" data-line="33"><span style="color: #d2a8ff;">Removing</span> <span style="color: #e6edf3;">intermediate</span> <span style="color: #e6edf3;">container</span> <span style="color: #e6edf3;">b28f471b4c5d</span>
</div><div class="line" data-line="34"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">5f025336ed65</span>
</div><div class="line" data-line="35"><span style="color: #d2a8ff;">Successfully</span> <span style="color: #e6edf3;">built</span> <span style="color: #e6edf3;">5f025336ed65</span>
</div><div class="line" data-line="36"><span style="color: #d2a8ff;">Successfully</span> <span style="color: #e6edf3;">tagged</span> <span style="color: #e6edf3;">local/aloha-base:2021.03.23</span>
</div></code></pre>
<p>This image has the two users that we need: app, which runs the application process, and admin, which can be used for further modifications (installing packages, changing config, etc.) if necessary. I usually split the container build into multiple separate Dockerfiles because these steps often take long enough to have a real impact on the overall build time. In some cases I cannot rely on Docker caching, so I build the images in a few stages. Only the last stage is application-specific, which keeps the CI/CD time to a minimum.</p>
<h2 id="creating-the-s6-container-based-on-base"><a href="#creating-the-s6-container-based-on-base">Creating the S6 container (based on base)</a></h2>
<p><a href="https://skarnet.org/software/s6/overview.html">S6</a> is a lightweight process supervision suite for managing long-lived processes in a UNIX-like operating system or in a container. Whether multi-process containers are a good idea is out of scope here.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">FROM local/aloha-base:2021.03.23
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">USER admin
</div><div class="line" data-line="4">
</div><div class="line" data-line="5">RUN sudo apt update
</div><div class="line" data-line="6">RUN sudo DEBIAN_FRONTEND=noninteractive apt install libpam-ldap unscd software-properties-common -y
</div><div class="line" data-line="7">
</div><div class="line" data-line="8">RUN sudo add-apt-repository ppa:deadsnakes/ppa
</div><div class="line" data-line="9">RUN sudo apt update
</div><div class="line" data-line="10">RUN sudo apt install python3.9 -y
</div><div class="line" data-line="11">
</div><div class="line" data-line="12">ADD https://github.com/just-containers/s6-overlay/releases/download/v2.2.0.3/s6-overlay-amd64-installer /tmp/
</div><div class="line" data-line="13">RUN sudo chmod +x /tmp/s6-overlay-amd64-installer &amp;&amp; sudo /tmp/s6-overlay-amd64-installer /
</div></code></pre>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">.</span> <span style="color: #e6edf3;">-t</span> <span style="color: #e6edf3;">local/aloha-s6:2021.03.23</span> <span style="color: #e6edf3;">--file</span> <span style="color: #e6edf3;">docker/0/Dockerfile</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">Sending</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">context</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">Docker</span> <span style="color: #e6edf3;">daemon</span>  <span style="color: #e6edf3;">239.1kB</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">1/9</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">FROM</span> <span style="color: #e6edf3;">local/aloha-base:2021.03.23</span>
</div><div class="line" data-line="4"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">5f025336ed65</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">2/9</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">USER</span> <span style="color: #e6edf3;">admin</span>
</div><div class="line" data-line="6"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Running</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">d987f02390a0</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">Removing</span> <span style="color: #e6edf3;">intermediate</span> <span style="color: #e6edf3;">container</span> <span style="color: #e6edf3;">d987f02390a0</span>
</div><div class="line" data-line="8"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">d3d934f11021</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">3/9</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #e6edf3;">sudo</span> <span style="color: #e6edf3;">apt</span> <span style="color: #e6edf3;">update</span>
</div><div class="line" data-line="10"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Running</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">0b4e504d1985</span>
</div><div class="line" data-line="11">
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">...</span>
</div><div class="line" data-line="13">
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">./usr/bin/execlineb</span>
</div><div class="line" data-line="15"><span style="color: #d2a8ff;">./usr/bin/justc-envdir</span>
</div><div class="line" data-line="16"><span style="color: #d2a8ff;">./usr/bin/fix-attrs</span>
</div><div class="line" data-line="17"><span style="color: #d2a8ff;">./usr/bin/printcontenv</span>
</div><div class="line" data-line="18"><span style="color: #d2a8ff;">./usr/bin/logutil-service-main</span>
</div><div class="line" data-line="19"><span style="color: #d2a8ff;">./usr/bin/logutil-newfifo</span>
</div><div class="line" data-line="20"><span style="color: #d2a8ff;">Removing</span> <span style="color: #e6edf3;">intermediate</span> <span style="color: #e6edf3;">container</span> <span style="color: #e6edf3;">0a047b2e0caf</span>
</div><div class="line" data-line="21"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">9f8947529cba</span>
</div><div class="line" data-line="22"><span style="color: #d2a8ff;">Successfully</span> <span style="color: #e6edf3;">built</span> <span style="color: #e6edf3;">9f8947529cba</span>
</div><div class="line" data-line="23"><span style="color: #d2a8ff;">Successfully</span> <span style="color: #e6edf3;">tagged</span> <span style="color: #e6edf3;">local/aloha-s6:2021.03.23</span>
</div></code></pre>
<p>This image has S6 and Python 3.9, and we can use these to add our application and NSCD as well.</p>
<h2 id="creating-the-application-container-based-on-s6"><a href="#creating-the-application-container-based-on-s6">Creating the application container (based on s6)</a></h2>
<p><a href="https://github.com/l1x/aloha">This repository</a> contains all the configuration for LDAP, NSS, PAM, and NSCD. S6 uses simple files organized in a tree where the folder name is the service name. Each service folder contains two files:</p>
<ul>
<li>run</li>
<li>finish</li>
</ul>
<p>More details about service directories: <a href="https://skarnet.org/software/s6/servicedir.html">link</a>.</p>
<p>Example run file for NSCD:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">#!/usr/bin/with-contenv sh</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">echo</span> <span style="color: #79c0ff;">&gt;&amp;</span><span style="color: #79c0ff;">2</span> <span style="color: #a5d6ff;">&quot;Starting unscd...&quot;</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">exec</span> <span style="color: #e6edf3;">/usr/sbin/nscd</span> <span style="color: #e6edf3;">-d</span> <span style="color: #e6edf3;">-f</span> <span style="color: #e6edf3;">/etc/nscd.conf</span>
</div></code></pre>
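<p>The service directory layout can also be sketched in Python. This is only an illustration: the finish script content is hypothetical, the run script is trimmed from the example above, and the directory is created in a temporary location rather than in the real repository.</p>

```python
import os
import stat
import tempfile

# Run script, trimmed from the NSCD example above.
RUN_SCRIPT = "#!/usr/bin/with-contenv sh\nexec /usr/sbin/nscd -d -f /etc/nscd.conf\n"
# Hypothetical finish script; S6 runs it after the run script exits.
FINISH_SCRIPT = "#!/bin/sh\necho \"nscd stopped\"\n"


def create_service(root, name, run_script, finish_script):
    """Create root/name/ containing executable run and finish scripts."""
    service_dir = os.path.join(root, name)
    os.makedirs(service_dir, exist_ok=True)
    for file_name, content in (("run", run_script), ("finish", finish_script)):
        path = os.path.join(service_dir, file_name)
        with open(path, "w", newline="\n") as handle:
            handle.write(content)
        # The scripts must be executable so the supervisor can start them.
        os.chmod(path, 0o755)
    return service_dir


with tempfile.TemporaryDirectory() as root:
    service_dir = create_service(root, "nscd", RUN_SCRIPT, FINISH_SCRIPT)
    created = sorted(os.listdir(service_dir))
    run_mode = stat.S_IMODE(os.stat(os.path.join(service_dir, "run")).st_mode)
    print(os.path.basename(service_dir), created)
```

<p>The folder name (nscd here) is what S6 treats as the service name.</p>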
<p>This folder is then copied into the container, and S6 picks it up when the container starts.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">FROM local/aloha-s6:2021.03.23
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">USER admin
</div><div class="line" data-line="4">
</div><div class="line" data-line="5">ADD ldap.conf /etc/ldap.conf
</div><div class="line" data-line="6">ADD nsswitch.conf /etc/nsswitch.conf
</div><div class="line" data-line="7">ADD login /etc/pam.d/login
</div><div class="line" data-line="8">ADD nscd.conf  /etc/nscd.conf
</div><div class="line" data-line="9">
</div><div class="line" data-line="10">RUN sudo mkdir -p /var/run/nscd/
</div><div class="line" data-line="11">RUN sudo chown unscd:unscd /var/run/nscd/
</div><div class="line" data-line="12">
</div><div class="line" data-line="13">COPY ./services /etc/services.d/
</div><div class="line" data-line="14">
</div><div class="line" data-line="15">ADD sudo-init /sudo-init
</div><div class="line" data-line="16">
</div><div class="line" data-line="17">ENTRYPOINT [ &quot;/sudo-init&quot; ]
</div></code></pre>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">.</span> <span style="color: #e6edf3;">-t</span> <span style="color: #e6edf3;">local/aloha-app:2021.03.23</span> <span style="color: #e6edf3;">--file</span> <span style="color: #e6edf3;">Dockerfile</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">Sending</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">context</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">Docker</span> <span style="color: #e6edf3;">daemon</span>  <span style="color: #e6edf3;">243.7kB</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">1/10</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">FROM</span> <span style="color: #e6edf3;">local/aloha-s6:2021.03.23</span>
</div><div class="line" data-line="4"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">9f8947529cba</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">2/10</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">USER</span> <span style="color: #e6edf3;">admin</span>
</div><div class="line" data-line="6"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="7"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">df4413ee8a64</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">3/10</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">ADD</span> <span style="color: #e6edf3;">ldap.conf</span> <span style="color: #e6edf3;">/etc/ldap.conf</span>
</div><div class="line" data-line="9"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="10"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">8bc9d31a124b</span>
</div><div class="line" data-line="11"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">4/10</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">ADD</span> <span style="color: #e6edf3;">nsswitch.conf</span> <span style="color: #e6edf3;">/etc/nsswitch.conf</span>
</div><div class="line" data-line="12"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="13"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">0dcf1793bd20</span>
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">5/10</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">ADD</span> <span style="color: #e6edf3;">login</span> <span style="color: #e6edf3;">/etc/pam.d/login</span>
</div><div class="line" data-line="15"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="16"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">9aa90594b708</span>
</div><div class="line" data-line="17"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">6/10</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">ADD</span> <span style="color: #e6edf3;">nscd.conf</span>  <span style="color: #e6edf3;">/etc/nscd.conf</span>
</div><div class="line" data-line="18"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="19"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">343ff90841bd</span>
</div><div class="line" data-line="20"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">7/10</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #e6edf3;">sudo</span> <span style="color: #e6edf3;">mkdir</span> <span style="color: #e6edf3;">-p</span> <span style="color: #e6edf3;">/var/run/nscd/</span>
</div><div class="line" data-line="21"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="22"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">5ca5e87d0a72</span>
</div><div class="line" data-line="23"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">8/10</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #e6edf3;">sudo</span> <span style="color: #e6edf3;">chown</span> <span style="color: #e6edf3;">unscd:unscd</span> <span style="color: #e6edf3;">/var/run/nscd/</span>
</div><div class="line" data-line="24"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="25"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">168dbcecba14</span>
</div><div class="line" data-line="26"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">9/10</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">COPY</span> <span style="color: #e6edf3;">./services</span> <span style="color: #e6edf3;">/etc/services.d/</span>
</div><div class="line" data-line="27"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="28"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">a6ad71fd377f</span>
</div><div class="line" data-line="29"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">10/10</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">ADD</span> <span style="color: #e6edf3;">sudo-init</span> <span style="color: #e6edf3;">/sudo-init</span>
</div><div class="line" data-line="30"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="31"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">d4dbc8afb7a7</span>
</div><div class="line" data-line="32"><span style="color: #d2a8ff;">Successfully</span> <span style="color: #e6edf3;">built</span> <span style="color: #e6edf3;">d4dbc8afb7a7</span>
</div><div class="line" data-line="33"><span style="color: #d2a8ff;">Successfully</span> <span style="color: #e6edf3;">tagged</span> <span style="color: #e6edf3;">local/aloha-app:2021.03.23</span>
</div></code></pre>
<h2 id="running-the-container"><a href="#running-the-container">Running the container</a></h2>
<p>Finally, we can run the container. I usually use <a href="https://www.nomadproject.io/">Nomad</a> for production workloads, but it is also possible to spin up the application locally to check that everything works.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"> <span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">run</span> <span style="color: #e6edf3;">--rm</span> <span style="color: #e6edf3;">-ti</span> <span style="color: #e6edf3;">local/aloha-app:2021.03.23</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">[</span>s6-init<span style="color: #e6edf3;">]</span><span style="color: #e6edf3;"></span> <span style="color: #d2a8ff;">making</span> <span style="color: #e6edf3;">user</span> <span style="color: #e6edf3;">provided</span> <span style="color: #e6edf3;">files</span> <span style="color: #e6edf3;">available</span> <span style="color: #e6edf3;">at</span> <span style="color: #e6edf3;">/var/run/s6/etc...exited</span> <span style="color: #e6edf3;">0.</span>
</div><div class="line" data-line="3"><span style="color: #e6edf3;">[</span>s6-init<span style="color: #e6edf3;">]</span><span style="color: #e6edf3;"></span> <span style="color: #d2a8ff;">ensuring</span> <span style="color: #e6edf3;">user</span> <span style="color: #e6edf3;">provided</span> <span style="color: #e6edf3;">files</span> <span style="color: #e6edf3;">have</span> <span style="color: #e6edf3;">correct</span> <span style="color: #e6edf3;">perms...exited</span> <span style="color: #e6edf3;">0.</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">[</span>fix-attrs.d<span style="color: #e6edf3;">]</span><span style="color: #e6edf3;"></span> <span style="color: #d2a8ff;">applying</span> <span style="color: #e6edf3;">ownership</span> <span style="color: #e6edf3;">&amp;</span> <span style="color: #d2a8ff;">permissions</span> <span style="color: #e6edf3;">fixes...</span>
</div><div class="line" data-line="5"><span style="color: #e6edf3;">[</span>fix-attrs.d<span style="color: #e6edf3;">]</span><span style="color: #e6edf3;"></span> <span style="color: #d2a8ff;">done.</span>
</div><div class="line" data-line="6"><span style="color: #e6edf3;">[</span>cont-init.d<span style="color: #e6edf3;">]</span><span style="color: #e6edf3;"></span> <span style="color: #d2a8ff;">executing</span> <span style="color: #e6edf3;">container</span> <span style="color: #e6edf3;">initialization</span> <span style="color: #e6edf3;">scripts...</span>
</div><div class="line" data-line="7"><span style="color: #e6edf3;">[</span>cont-init.d<span style="color: #e6edf3;">]</span><span style="color: #e6edf3;"></span> <span style="color: #d2a8ff;">done.</span>
</div><div class="line" data-line="8"><span style="color: #e6edf3;">[</span>services.d<span style="color: #e6edf3;">]</span><span style="color: #e6edf3;"></span> <span style="color: #d2a8ff;">starting</span> <span style="color: #e6edf3;">services</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">Starting</span> <span style="color: #e6edf3;">web</span> <span style="color: #e6edf3;">app...</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">Starting</span> <span style="color: #e6edf3;">unscd...</span>
</div><div class="line" data-line="11"><span style="color: #e6edf3;">[</span>services.d<span style="color: #e6edf3;">]</span><span style="color: #e6edf3;"></span> <span style="color: #d2a8ff;">done.</span>
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">unscd</span> <span style="color: #e6edf3;">v0.52,</span> <span style="color: #e6edf3;">debug</span> <span style="color: #e6edf3;">level</span> <span style="color: #79c0ff;">0x1</span>
</div><div class="line" data-line="13"><span style="color: #d2a8ff;">aged</span> <span style="color: #e6edf3;">cache,</span> <span style="color: #e6edf3;">freed:0,</span> <span style="color: #e6edf3;">remain:0</span>
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">service</span> <span style="color: #79c0ff;">2</span> <span style="color: #e6edf3;">is</span> <span style="color: #e6edf3;">disabled,</span> <span style="color: #e6edf3;">dropping</span>
</div><div class="line" data-line="15"><span style="color: #d2a8ff;">Serving</span> <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">on</span> <span style="color: #e6edf3;">0.0.0.0</span> <span style="color: #e6edf3;">port</span> <span style="color: #79c0ff;">8000</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">http://0.0.0.0:8000/</span><span style="color: #e6edf3;">)</span> ...
</div></code></pre>
<p>I have enabled NSCD debugging, so it prints debug messages to the console. This output can be either disabled entirely or redirected to a log file. There are other ways of building LDAP-enabled containers; some people use <a href="https://sssd.io/">SSSD</a>. Because I was already familiar with NSCD, I went down this path.</p>
<p>If you know a better way of doing cached LDAP lookups in a container, let me know in the <a href="">HN thread</a>.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 22 Mar 2021 14:10:11 +0100</pubDate>
    </item>
    <item>
      <title>Compressing data with Parquet</title>
      <link>https://dev.l1x.be/posts/2021/03/08/compressing-data-with-parquet/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2021/03/08/compressing-data-with-parquet/</guid>
      <content:encoded><![CDATA[<h2 id="abstract"><a href="#abstract">Abstract</a></h2>
<p>I often see people use <a href="https://www.sqlite.org/index.html">SQLite</a> for distributing large datasets. When the use case is analytical (OLAP), there are often better options. We are going to investigate how much better we can do with something other than SQLite. To be clear, I love SQLite and use it a lot when a simple, single-file SQL database does the job. For this particular use case, though, I think <a href="http://parquet.apache.org/">Parquet</a> is better suited. We are going to explore why.</p>
<h2 id="the-rise-of-columnar-formats"><a href="#the-rise-of-columnar-formats">The rise of columnar formats</a></h2>
<p>A while back, <a href="https://research.fb.com/wp-content/uploads/2011/01/rcfile-a-fast-and-space-efficient-data-placement-structure-in-mapreduce-based-warehouse-systems.pdf">Facebook and Ohio State University investigated</a> the best option for storing a large volume of data and, not too surprisingly, a columnar system came out as the winner. There are many reasons why, and I am not going to go into the details in this article because I do not have that much time. The point is that if you have repetition in your data, a columnar format can be compressed much better than a row-oriented format. On top of that, if a query reads only a subset of the table's fields, a columnar reader can skip a whole lot of data, which speeds up processing. Furthermore, strong compression (gzip, brotli, zstd) can be applied to columnar files, making them even smaller. Smaller files are preferable because the slowest part of data processing is still disk IO.</p>
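<p>To make the repetition argument concrete, here is a toy sketch using run-length encoding. The rows are made up, and Parquet's real encodings (dictionary encoding and a run-length/bit-packing hybrid) are more involved; this only illustrates why storing each field contiguously helps.</p>

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


# Hypothetical rows of (language, license), already sorted.
rows = (
    [("JavaScript", "GPL-3.0")] * 4
    + [("Python", "GPL-3.0")] * 3
    + [("Python", "MIT")] * 2
)

# Row-oriented: fields alternate, so equal values never sit next to each other.
row_stream = [field for row in rows for field in row]

# Column-oriented: each field is stored contiguously.
language_column = [language for language, _ in rows]
license_column = [license_name for _, license_name in rows]

row_runs = run_length_encode(row_stream)
language_runs = run_length_encode(language_column)
license_runs = run_length_encode(license_column)

print(len(row_stream), "values in row order,", len(row_runs), "runs")
print("language runs:", language_runs)
print("license runs:", license_runs)
```

<p>In row order the 18 values produce 18 runs, so nothing is saved, while each column collapses to just two runs.</p>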
<h2 id="the-use-case"><a href="#the-use-case">The use case</a></h2>
<p>While reading through a <a href="https://news.ycombinator.com/item?id=26371706">post on HN</a>, I ran into a comment from <a href="https://news.ycombinator.com/user?id=zomglings">zomglings</a> explaining that they have a dataset that is an export of some GitHub data.</p>
<blockquote>
<p>The dataset for a single crawl comes in at about 60GB. We uploaded the data to Kaggle because we thought it would be a good place for people to work with the data. Unfortunately, the Kaggle notebook experience is not tailored to such large datasets. Our dataset is in an SQLite database. It takes a long time for the dataset to load into Kaggle notebooks, and I don't think they are provisioned with SSDs as queries take a long time. Our best workaround to this
is to partition into 3 datasets on Kaggle - train, eval, and development, but it will be a pain to manage this for every update, especially as we enrich
the dataset with results of static analysis, etc.</p>
</blockquote>
<p>I was wondering if we could do better than the SQLite version.</p>
<h2 id="the-initial-dataset"><a href="#the-initial-dataset">The initial dataset</a></h2>
<p>After downloading the sample dataset from <a href="https://www.kaggle.com/simiotic/github-code-snippets-development-sample">Kaggle</a>, I started to explore it a bit.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">sqlite3 -header -csv -readonly snippets-dev.db &#39;.schema snippets&#39;
</div><div class="line" data-line="2">CREATE TABLE snippets (
</div><div class="line" data-line="3">    id INTEGER PRIMARY KEY,
</div><div class="line" data-line="4">    snippet TEXT NOT NULL,
</div><div class="line" data-line="5">    language TEXT NOT NULL,
</div><div class="line" data-line="6">    repo_file_name TEXT,
</div><div class="line" data-line="7">    github_repo_url TEXT,
</div><div class="line" data-line="8">    license TEXT,
</div><div class="line" data-line="9">    commit_hash TEXT,
</div><div class="line" data-line="10">    starting_line_number INTEGER,
</div><div class="line" data-line="11">    chunk_size INTEGER,
</div><div class="line" data-line="12">    UNIQUE(commit_hash, repo_file_name, github_repo_url, chunk_size, starting_line_number)
</div><div class="line" data-line="13">);
</div><div class="line" data-line="14">CREATE INDEX snippets_github_repo_url on snippets(github_repo_url);
</div><div class="line" data-line="15">CREATE INDEX snippets_license on snippets(license);
</div><div class="line" data-line="16">CREATE INDEX snippets_language on snippets(language);
</div></code></pre>
<p>The first thing I found was that there is some weird spacing going on in the commit_hash column:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">sqlite&gt; SELECT commit_hash FROM snippets LIMIT 10;
</div><div class="line" data-line="2">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="3">
</div><div class="line" data-line="4">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="5">
</div><div class="line" data-line="6">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="7">
</div><div class="line" data-line="8">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="9">
</div><div class="line" data-line="10">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="11">
</div><div class="line" data-line="12">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="13">
</div><div class="line" data-line="14">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="15">
</div><div class="line" data-line="16">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="17">
</div><div class="line" data-line="18">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="19">
</div><div class="line" data-line="20">000427352ad89da7fb4325037c116a3b06745608
</div></code></pre>
<p>I realized that there is a trailing newline for each commit hash. I quickly fixed that.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">sqlite&gt; UPDATE snippets SET commit_hash = REPLACE(commit_hash, CHAR(10), &#39;&#39;);
</div><div class="line" data-line="2">sqlite&gt; UPDATE snippets SET commit_hash = REPLACE(commit_hash, CHAR(13), &#39;&#39;);
</div></code></pre>
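<p>The same cleanup can be tried on a throwaway in-memory database from Python. This is a sketch with made-up rows, not the real dataset; it nests the two REPLACE calls into a single UPDATE.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snippets (id INTEGER PRIMARY KEY, commit_hash TEXT)")
# Hypothetical rows: the same hash stored with LF, CRLF, and no line ending.
conn.executemany(
    "INSERT INTO snippets (commit_hash) VALUES (?)",
    [
        ("000427352ad89da7fb4325037c116a3b06745608\n",),
        ("000427352ad89da7fb4325037c116a3b06745608\r\n",),
        ("000427352ad89da7fb4325037c116a3b06745608",),
    ],
)

count_query = "SELECT COUNT(DISTINCT commit_hash) FROM snippets"
distinct_before = conn.execute(count_query).fetchone()[0]

# Strip line feeds (CHAR(10)) and carriage returns (CHAR(13)) in one statement.
conn.execute(
    "UPDATE snippets "
    "SET commit_hash = REPLACE(REPLACE(commit_hash, CHAR(10), ''), CHAR(13), '')"
)
conn.commit()

distinct_after = conn.execute(count_query).fetchone()[0]
lengths = [
    row[0]
    for row in conn.execute("SELECT LENGTH(commit_hash) FROM snippets ORDER BY id")
]
print(distinct_before, distinct_after, lengths)
conn.close()
```

<p>Besides the odd output, stray line endings inflate the distinct count of the column: three spellings of one hash collapse into a single 40-character value after the update.</p>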
<p>Now the commit hashes looked ok.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">sqlite&gt; SELECT commit_hash FROM snippets LIMIT 10;
</div><div class="line" data-line="2">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="3">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="4">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="5">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="6">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="7">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="8">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="9">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="10">000427352ad89da7fb4325037c116a3b06745608
</div><div class="line" data-line="11">000427352ad89da7fb4325037c116a3b06745608
</div></code></pre>
<p>Before I continued working with the dataset, I quickly vacuumed the database:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">sqlite&gt; PRAGMA auto_vacuum = FULL;
</div><div class="line" data-line="2">sqlite&gt; VACUUM;
</div></code></pre>
<p>It reduced the size by 100MB.</p>
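<p>Why does this help? Deleted or rewritten rows leave free pages behind in the file, and VACUUM rebuilds the database without them. A minimal sketch on a temporary database (the table and row contents are made up) shows the effect:</p>

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    db_path = os.path.join(tmp_dir, "demo.db")
    # isolation_level=None puts the connection in autocommit mode,
    # so VACUUM never runs inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute("CREATE TABLE snippets (id INTEGER PRIMARY KEY, snippet TEXT)")

    conn.execute("BEGIN")
    conn.executemany(
        "INSERT INTO snippets (snippet) VALUES (?)",
        (("x" * 1000,) for _ in range(2000)),
    )
    conn.execute("COMMIT")

    # Deleting every other row frees space inside the file but does not shrink it.
    conn.execute("DELETE FROM snippets WHERE id % 2 = 0")
    size_before = os.path.getsize(db_path)

    conn.execute("VACUUM")
    size_after = os.path.getsize(db_path)
    conn.close()

print("before VACUUM:", size_before, "bytes; after VACUUM:", size_after, "bytes")
```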
<p>As a next step, I looked at the fields (skipping the snippet column):</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">sqlite&gt; SELECT id
</div><div class="line" data-line="2">  , language
</div><div class="line" data-line="3">  , repo_file_name
</div><div class="line" data-line="4">  , github_repo_url
</div><div class="line" data-line="5">  , license
</div><div class="line" data-line="6">  , commit_hash
</div><div class="line" data-line="7">  , starting_line_number
</div><div class="line" data-line="8">  , chunk_size
</div><div class="line" data-line="9">FROM snippets
</div><div class="line" data-line="10">LIMIT 10;
</div><div class="line" data-line="11">id          language    repo_file_name            github_repo_url                   license     commit_hash                               starting_line_number  chunk_size
</div><div class="line" data-line="12">----------  ----------  ------------------------  --------------------------------  ----------  ----------------------------------------  --------------------  ----------
</div><div class="line" data-line="13">491         DOTFILE     NodeBB/NodeBB/.gitignore  https://github.com/NodeBB/NodeBB  GPL-3.0     21634e2681fb1329bcbab7b2e19418ebdb1012e1  65                    5
</div><div class="line" data-line="14">512         UNKNOWN     NodeBB/NodeBB/LICENSE     https://github.com/NodeBB/NodeBB  GPL-3.0     21634e2681fb1329bcbab7b2e19418ebdb1012e1  100                   5
</div><div class="line" data-line="15">584         UNKNOWN     NodeBB/NodeBB/LICENSE     https://github.com/NodeBB/NodeBB  GPL-3.0     21634e2681fb1329bcbab7b2e19418ebdb1012e1  460                   5
</div><div class="line" data-line="16">610         UNKNOWN     NodeBB/NodeBB/LICENSE     https://github.com/NodeBB/NodeBB  GPL-3.0     21634e2681fb1329bcbab7b2e19418ebdb1012e1  590                   5
</div><div class="line" data-line="17">627         JavaScript  NodeBB/NodeBB/test/group  https://github.com/NodeBB/NodeBB  GPL-3.0     21634e2681fb1329bcbab7b2e19418ebdb1012e1  5                     5
</div><div class="line" data-line="18">638         JavaScript  NodeBB/NodeBB/test/group  https://github.com/NodeBB/NodeBB  GPL-3.0     21634e2681fb1329bcbab7b2e19418ebdb1012e1  60                    5
</div><div class="line" data-line="19">646         JavaScript  NodeBB/NodeBB/test/group  https://github.com/NodeBB/NodeBB  GPL-3.0     21634e2681fb1329bcbab7b2e19418ebdb1012e1  100                   5
</div><div class="line" data-line="20">673         JavaScript  NodeBB/NodeBB/test/group  https://github.com/NodeBB/NodeBB  GPL-3.0     21634e2681fb1329bcbab7b2e19418ebdb1012e1  235                   5
</div><div class="line" data-line="21">690         JavaScript  NodeBB/NodeBB/test/group  https://github.com/NodeBB/NodeBB  GPL-3.0     21634e2681fb1329bcbab7b2e19418ebdb1012e1  320                   5
</div><div class="line" data-line="22">714         JavaScript  NodeBB/NodeBB/test/group  https://github.com/NodeBB/NodeBB  GPL-3.0     21634e2681fb1329bcbab7b2e19418ebdb1012e1  440                   5
</div></code></pre>
<p>Based on this, I saw that many of the fields potentially contain a lot of repetition. This can be quickly verified:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">SELECT
</div><div class="line" data-line="2">  COUNT (DISTINCT language) AS  language_dcnt
</div><div class="line" data-line="3">  , COUNT (DISTINCT repo_file_name) AS repo_file_name_dcnt
</div><div class="line" data-line="4">  , COUNT (DISTINCT github_repo_url) AS github_repo_url_dcnt
</div><div class="line" data-line="5">  , COUNT (DISTINCT license) AS license_dcnt
</div><div class="line" data-line="6">  , COUNT (DISTINCT commit_hash) AS commit_hash_dcnt
</div><div class="line" data-line="7">  , COUNT (DISTINCT starting_line_number) AS starting_line_number_dcnt
</div><div class="line" data-line="8">  , COUNT (DISTINCT chunk_size) AS chunk_size_dcnt
</div><div class="line" data-line="9">FROM snippets LIMIT 1;
</div><div class="line" data-line="10">
</div><div class="line" data-line="11">language_dcnt
</div><div class="line" data-line="12">-------------
</div><div class="line" data-line="13">21
</div><div class="line" data-line="14">
</div><div class="line" data-line="15">repo_file_name_dcnt
</div><div class="line" data-line="16">-------------------
</div><div class="line" data-line="17">1150333
</div><div class="line" data-line="18">
</div><div class="line" data-line="19">github_repo_url_dcnt
</div><div class="line" data-line="20">--------------------
</div><div class="line" data-line="21">1621
</div><div class="line" data-line="22">
</div><div class="line" data-line="23">license_dcnt
</div><div class="line" data-line="24">------------
</div><div class="line" data-line="25">21
</div><div class="line" data-line="26">
</div><div class="line" data-line="27">commit_hash_dcnt
</div><div class="line" data-line="28">----------------
</div><div class="line" data-line="29">1621
</div><div class="line" data-line="30">
</div><div class="line" data-line="31">starting_line_number_dcnt
</div><div class="line" data-line="32">-------------------------
</div><div class="line" data-line="33">27833
</div><div class="line" data-line="34">
</div><div class="line" data-line="35">chunk_size_dcnt
</div><div class="line" data-line="36">---------------
</div><div class="line" data-line="37">1
</div></code></pre>
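<p>The same check can be scripted so that the columns come out ordered by cardinality. The sketch below uses a small in-memory table with a handful of made-up rows and only four of the columns, so the counts are not those of the real dataset.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snippets ("
    "language TEXT, license TEXT, starting_line_number INTEGER, chunk_size INTEGER)"
)
# Hypothetical sample rows, loosely modelled on the output above.
rows = [
    ("DOTFILE", "GPL-3.0", 65, 5),
    ("UNKNOWN", "GPL-3.0", 100, 5),
    ("JavaScript", "GPL-3.0", 5, 5),
    ("JavaScript", "GPL-3.0", 60, 5),
    ("JavaScript", "GPL-3.0", 100, 5),
]
conn.executemany("INSERT INTO snippets VALUES (?, ?, ?, ?)", rows)

columns = ["language", "license", "starting_line_number", "chunk_size"]
# Column names come from the fixed list above, so formatting them into SQL is safe.
distinct_counts = {
    column: conn.execute(f"SELECT COUNT(DISTINCT {column}) FROM snippets").fetchone()[0]
    for column in columns
}

# Lowest-cardinality columns first: candidates for leading sort keys.
for column, count in sorted(distinct_counts.items(), key=lambda item: item[1]):
    print(f"{column}: {count}")
conn.close()
```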
<h2 id="exporting-data"><a href="#exporting-data">Exporting data</a></h2>
<p>Based on the previous distinct counts, I decided to export the data the following way:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">sqlite3 -header -csv -readonly snippets-dev.db &#39;SELECT * FROM snippets ORDER BY chunk_size, license, language, github_repo_url, commit_hash&#39; &gt; test1.csv
</div></code></pre>
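<p>The ORDER BY goes roughly from the lowest-cardinality column to the higher ones, presumably so that equal values end up next to each other. A rough sketch of why that helps a compressor, using zlib from the standard library as a stand-in and made-up license and language values:</p>

```python
import random
import zlib

licenses = ["Apache-2.0", "GPL-3.0", "MIT"]
languages = ["Go", "JavaScript", "Python", "Rust"]

# Hypothetical dataset: 500 rows for each (license, language) pair.
rows = [
    f"{license_name},{language}\n"
    for license_name in licenses
    for language in languages
    for _ in range(500)
]

# Same rows in a random but reproducible order.
shuffled_rows = rows[:]
random.Random(42).shuffle(shuffled_rows)
sorted_rows = sorted(rows)

shuffled_size = len(zlib.compress("".join(shuffled_rows).encode("utf-8")))
sorted_size = len(zlib.compress("".join(sorted_rows).encode("utf-8")))
print("shuffled:", shuffled_size, "bytes; sorted:", sorted_size, "bytes")
```

<p>Both inputs contain exactly the same rows; only the order differs, and the sorted version compresses to a fraction of the shuffled one.</p>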
<p>Using the freshly exported file, I can now convert the CSV to Parquet with default settings first.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-python" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">import</span> <span style="color: #e6edf3;">pandas</span> <span style="color: #ff7b72;">as</span> <span style="color: #e6edf3;">pd</span>
</div><div class="line" data-line="2"><span style="color: #ff7b72;">import</span> <span style="color: #e6edf3;">sys</span>
</div><div class="line" data-line="3"><span style="color: #e6edf3;">df</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">pd</span>.<span style="color: #79c0ff;">read_csv</span>(<span style="color: #e6edf3;">sys</span>.<span style="color: #79c0ff;">argv</span>[<span style="color: #79c0ff;">1</span>])
</div><div class="line" data-line="4"><span style="color: #e6edf3;">df</span>.<span style="color: #79c0ff;">to_parquet</span>(<span style="color: #e6edf3;">sys</span>.<span style="color: #79c0ff;">argv</span>[<span style="color: #79c0ff;">2</span>])
</div></code></pre>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">python</span> <span style="color: #e6edf3;">par.py</span> <span style="color: #e6edf3;">test1.csv</span> <span style="color: #e6edf3;">test1.parquet</span>
</div></code></pre>
<p>The results are already good:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">2.9G</span> <span style="color: #e6edf3;">Mar</span>  <span style="color: #79c0ff;">8</span> <span style="color: #e6edf3;">08:37</span> <span style="color: #e6edf3;">snippets-dev.db</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">427M</span> <span style="color: #e6edf3;">Mar</span>  <span style="color: #79c0ff;">8</span> <span style="color: #e6edf3;">14:05</span> <span style="color: #e6edf3;">test1.parquet</span>
</div></code></pre>
<p>Now we can load the Parquet file into a DataFrame using Python:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-python" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">import</span> <span style="color: #e6edf3;">pandas</span> <span style="color: #ff7b72;">as</span> <span style="color: #e6edf3;">pd</span>
</div><div class="line" data-line="2"><span style="color: #ff7b72;">import</span> <span style="color: #e6edf3;">sys</span>
</div><div class="line" data-line="3">
</div><div class="line" data-line="4"><span style="color: #e6edf3;">df</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">pd</span>.<span style="color: #79c0ff;">read_parquet</span>(<span style="color: #e6edf3;">sys</span>.<span style="color: #79c0ff;">argv</span>[<span style="color: #79c0ff;">1</span>], <span style="color: #e6edf3;">engine</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;pyarrow&#39;</span>)
</div></code></pre>
<p>Then I remembered that the default compression for Parquet in the Python library is still snappy for some weird reason. By changing that to gzip, we can do much better.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-python" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">df</span>.<span style="color: #79c0ff;">to_parquet</span>(<span style="color: #e6edf3;">sys</span>.<span style="color: #79c0ff;">argv</span>[<span style="color: #79c0ff;">2</span>], <span style="color: #e6edf3;">compression</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;gzip&#39;</span>)
</div></code></pre>
<p>After writing another version compressed with gzip, the result is much better:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">265M</span> <span style="color: #e6edf3;">Mar</span>  <span style="color: #79c0ff;">8</span> <span style="color: #e6edf3;">17:28</span> <span style="color: #e6edf3;">test1.gzip.parquet</span>
</div></code></pre>
<p>Results:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">Sqlite:</span>           <span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">|</span><span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">|</span>  <span style="color: #d2a8ff;">3000MB</span> <span style="color: #e6edf3;">100%</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">Parquet</span><span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">Snappy</span><span style="color: #e6edf3;">)</span>:  <span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">||</span>                            <span style="color: #d2a8ff;">427MB</span>  <span style="color: #e6edf3;">14.2%</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">Parquet</span><span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">Gzip</span><span style="color: #e6edf3;">)</span>:    <span style="color: #79c0ff;">||</span><span style="color: #79c0ff;">|</span>                             <span style="color: #d2a8ff;">265MB</span>  <span style="color: #e6edf3;">8.83%</span>
</div></code></pre>
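<p>The percentages in the chart are simple ratios against the SQLite baseline. A minimal sketch of that arithmetic, assuming the rounded sizes from the chart (3000 MB, 427 MB, 265 MB); the function names are made up for illustration:</p>

```python
# Sizes in MB, taken from the chart above (SQLite rounded up to 3000 MB).
SIZES_MB = {
    "Sqlite": 3000,
    "Parquet(Snappy)": 427,
    "Parquet(Gzip)": 265,
}


def size_report(sizes, baseline="Sqlite", width=30):
    """Return (name, bar, size, percent-of-baseline) rows."""
    base = sizes[baseline]
    rows = []
    for name, size in sizes.items():
        pct = round(100 * size / base, 2)
        # Scale the bar to `width` characters, with at least one mark.
        bar = "|" * max(1, round(width * size / base))
        rows.append((name, bar, size, pct))
    return rows


def reduction_pct(sizes, baseline, target):
    """Percentage saved by `target` relative to `baseline`."""
    return round(100 * (1 - sizes[target] / sizes[baseline]), 2)


if __name__ == "__main__":
    for name, bar, size, pct in size_report(SIZES_MB):
        print(f"{name + ':':<18}{bar:<32}{size}MB {pct}%")
    print("Gzip reduction:", reduction_pct(SIZES_MB, "Sqlite", "Parquet(Gzip)"), "%")
```

<p>With these inputs the gzip file comes out at 8.83% of the baseline, which is the 91.17% reduction.</p>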
<h2 id="closing"><a href="#closing">Closing</a></h2>
<p>A 91.17% reduction with Parquet is not that bad. If you need help with data engineering or have any performance-related questions (especially about querying large datasets), feel free to reach out. See the links below.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 08 Mar 2021 20:38:21 +0100</pubDate>
    </item>
    <item>
      <title>Compressing AWS S3 logs after getting HackerNewsed</title>
      <link>https://dev.l1x.be/posts/2020/12/20/compressing-aws-s3-logs-after-getting-hackernewsed/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2020/12/20/compressing-aws-s3-logs-after-getting-hackernewsed/</guid>
      <content:encoded><![CDATA[<h2 id="abstract"><a href="#abstract">Abstract</a></h2>
<p>One of my previous articles about Firecracker and RPI got posted on HN, and I just realized that many months ago, I enabled logging on the S3 bucket hosting this content. I wanted to take a quick peek at the stats, and that is when I discovered that Athena could not process compressed S3 logs.</p>
<p>I was already working on a larger AWS codebase in F#, so I decided to write a tool that can download the raw logs from S3, merge all the small files, convert them to Parquet, and upload those back.</p>
<p>C# has excellent AWS libraries, and after a bit of wrapping, those are suitable for F# development.</p>
<h2 id="processing-log-files"><a href="#processing-log-files">Processing log files</a></h2>
<p>First, I just created types so I can handle errors on the caller side:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">// S3ReadError can be used for other read operations, not just for Get
</div><div class="line" data-line="2">type S3ReadError =
</div><div class="line" data-line="3">  | NotFound of key: string
</div><div class="line" data-line="4">  | S3ReadPermissionDenied of keyOrPrefix: string
</div><div class="line" data-line="5">  | S3ReadException of keyOrPrefix: string * isRecoverable: bool * httpStatus: int option * ex: Exception option
</div><div class="line" data-line="6">
</div><div class="line" data-line="7">type S3GetBytesSuccess = S3GetBytesSuccess of key: string * value: byte []
</div><div class="line" data-line="8">
</div><div class="line" data-line="9">type S3GetBytesReturn = Result&lt;S3GetBytesSuccess, S3ReadError&gt;
</div></code></pre>
<p>The AWS SDK already makes you use async code, but it is easy to turn that into sync calls if you do not want to have async in your code. Bob Nystrom already pointed out why async, as we use it in many languages, is problematic:</p>
<p><a href="https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/">What color is your function</a></p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">//
</div><div class="line" data-line="2">// READ - GET - ASYNC
</div><div class="line" data-line="3">//
</div><div class="line" data-line="4">
</div><div class="line" data-line="5">member this.GetS3ObjectBytesAsync (bucket: string) (key: string): Async&lt;S3GetBytesReturn&gt; =
</div><div class="line" data-line="6">  async &lbrace;
</div><div class="line" data-line="7">    try
</div><div class="line" data-line="8">      let! ct = Async.CancellationToken
</div><div class="line" data-line="9">
</div><div class="line" data-line="10">      let request =
</div><div class="line" data-line="11">        GetObjectRequest(BucketName = bucket, Key = key)
</div><div class="line" data-line="12">
</div><div class="line" data-line="13">      let task = awsS3Client.GetObjectAsync(request, ct)
</div><div class="line" data-line="14">
</div><div class="line" data-line="15">      let! result = task |&gt; Async.AwaitTask
</div><div class="line" data-line="16">
</div><div class="line" data-line="17">      match result.HttpStatusCode with
</div><div class="line" data-line="18">      | HttpStatusCode.OK -&gt; return Ok(S3GetBytesSuccess(key, (readAllBytes result.ResponseStream)))
</div><div class="line" data-line="19">      | httpStatus -&gt; return Error(S3ReadException(key, false, (Some(int httpStatus)), None))
</div><div class="line" data-line="20">
</div><div class="line" data-line="21">    with ex -&gt; return Error(handleReadException key ex)
</div><div class="line" data-line="22">  &rbrace;
</div><div class="line" data-line="23">
</div><div class="line" data-line="24">
</div><div class="line" data-line="25">//
</div><div class="line" data-line="26">// READ - GET - SYNC
</div><div class="line" data-line="27">//
</div><div class="line" data-line="28">
</div><div class="line" data-line="29">member this.GetS3ObjectBytes (bucket: string) (key: string): S3GetBytesReturn =
</div><div class="line" data-line="30">  this.GetS3ObjectBytesAsync bucket key
</div><div class="line" data-line="31">  |&gt; Async.RunSynchronously
</div></code></pre>
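<p>For readers more at home in Python, the same idea (errors returned as values, plus a thin sync wrapper over an async call) can be sketched with <code>asyncio</code>. This is only an illustration under stated assumptions: the dataclasses, the function names and the in-memory <code>FAKE_BUCKET</code> are hypothetical stand-ins, not part of the F# tool or of any AWS SDK.</p>

```python
import asyncio
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class GetBytesSuccess:
    key: str
    value: bytes


@dataclass
class ReadError:
    key: str
    is_recoverable: bool
    http_status: Optional[int] = None
    message: Optional[str] = None


# Either a success or an error value; the caller decides how to handle it.
GetBytesReturn = Union[GetBytesSuccess, ReadError]

# Hypothetical in-memory "bucket" standing in for S3 so the sketch runs offline.
FAKE_BUCKET = {"logs/2020-12-20.txt": b"GET /posts/ 200\n"}


async def get_object_bytes_async(bucket: dict, key: str) -> GetBytesReturn:
    """Async read that returns errors as values instead of raising."""
    try:
        await asyncio.sleep(0)  # placeholder for the real network call
        return GetBytesSuccess(key, bucket[key])
    except KeyError:
        return ReadError(key, is_recoverable=False, http_status=404, message="not found")


def get_object_bytes(bucket: dict, key: str) -> GetBytesReturn:
    """Sync wrapper: run the coroutine to completion, much like Async.RunSynchronously."""
    return asyncio.run(get_object_bytes_async(bucket, key))
```

<p>Calling <code>get_object_bytes(FAKE_BUCKET, "missing")</code> returns a <code>ReadError</code> value rather than raising, so the caller can match on the result type.</p>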
<p>With such functions, it is straightforward to write parallel code where we can control the degree of parallelism. This is the first version. I use a global state variable, which is not very idiomatic in functional programming, but since this is a simple linear execution with a single point of mutation, it does not matter. I quite often see functional programming taken to the extreme, declaring that all mutations are evil. F# is one of the most productive languages out there precisely because it lets me use mutation when I need it.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">let doDownloadFiles (fileStates: Dictionary&lt;string, FileState&gt;) (s3v2: S3v2) (localFolder: string) (bucket: string) =
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">  let asynTaskList =
</div><div class="line" data-line="4">    fileStates
</div><div class="line" data-line="5">    |&gt; Seq.map (fun fileEntry -&gt;
</div><div class="line" data-line="6">        async &lbrace;
</div><div class="line" data-line="7">          match (downloadFile s3v2 localFolder bucket fileEntry.Key) with
</div><div class="line" data-line="8">          | Ok _x -&gt; return (fileEntry.Key, Downloaded)
</div><div class="line" data-line="9">          | Error err -&gt; return (fileEntry.Key, (FileStateError err))
</div><div class="line" data-line="10">        &rbrace;)
</div><div class="line" data-line="11">
</div><div class="line" data-line="12">  Async.Parallel(asynTaskList, 10)
</div><div class="line" data-line="13">  |&gt; Async.RunSynchronously
</div><div class="line" data-line="14">  |&gt; Seq.iter (fun (k, v) -&gt; fileStates.[k] &lt;- v)
</div></code></pre>
<p>The complete code is here:</p>
<p><a href="https://github.com/l1x/s3logs">https://github.com/l1x/s3logs</a></p>
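<p>The bounded parallelism used in <code>doDownloadFiles</code> can also be sketched in Python with a semaphore. This is a rough analogue, not a port of the F# code: <code>download_file</code> is a hypothetical stand-in that only sleeps, and the state names are plain strings.</p>

```python
import asyncio

MAX_PARALLEL = 10  # same limit as Async.Parallel(asynTaskList, 10)


async def download_file(key: str) -> str:
    """Hypothetical stand-in for the real download; returns a state name."""
    await asyncio.sleep(0.01)
    return "FileStateError" if key.endswith(".bad") else "Downloaded"


async def download_all(file_states: dict, limit: int = MAX_PARALLEL) -> int:
    """Download every key with at most `limit` in flight; return the observed peak."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def worker(key: str):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                state = await download_file(key)
            finally:
                in_flight -= 1
        return key, state

    results = await asyncio.gather(*(worker(k) for k in file_states))
    # Single point of mutation, applied after all downloads have finished.
    for key, state in results:
        file_states[key] = state
    return peak


if __name__ == "__main__":
    states = {f"log-{i}.txt": "Pending" for i in range(25)}
    states["log-x.bad"] = "Pending"
    print(asyncio.run(download_all(states)), states["log-x.bad"])
```

<p>As in the F# version, the shared dictionary is only written once all the work is done, so the mutation stays in one place.</p>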
<h2 id="visualization"><a href="#visualization">Visualization</a></h2>
<p>After I could download and process the text files and upload the Parquet files, I wanted to visualize the data. I am a long-time fan of <a href="https://plotly.com/">Plotly</a> for many reasons, and they have a project called <a href="https://plotly.com/dash/">Dash</a> that I had wanted to try for a long time.</p>
<p>Processing weblogs requires a bit of data reshuffling and, in many cases, some aggregation. For that, Python's Pandas library is an ok choice. I am not saying it is excellent because there are many ways of doing the same thing, the error messages are not clear, and after working with it for 5 years, I still cannot use it without reading its documentation and StackOverflow, often both.</p>
<h3 id="referers"><a href="#referers">Referers</a></h3>
<p>The first metric I was curious about was the top referrers. The &quot;null&quot; values (logged as <code>-</code>) and self-referrals from this site must be removed first, and then we can use Pandas and groupby to get the top 15 referrers.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-python" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">df_ref</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">df</span>[<span style="color: #e6edf3;">df</span>[<span style="color: #a5d6ff;">&#39;CsReferer&#39;</span>] <span style="color: #79c0ff;">!=</span> <span style="color: #a5d6ff;">&#39;-&#39;</span>]
</div><div class="line" data-line="2"><span style="color: #e6edf3;">filter_self</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">df_ref</span>[<span style="color: #a5d6ff;">&#39;CsReferer&#39;</span>].<span style="color: #79c0ff;">str</span>.<span style="color: #79c0ff;">contains</span>(<span style="color: #a5d6ff;">&#39;dev\\.l1x\\.be&#39;</span>)
</div><div class="line" data-line="3"><span style="color: #e6edf3;">df_ref</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">df_ref</span>[<span style="color: #79c0ff;">~</span><span style="color: #e6edf3;">filter_self</span>]
</div><div class="line" data-line="4"><span style="color: #e6edf3;">top_ref</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">df_ref</span>.<span style="color: #79c0ff;">groupby</span>([<span style="color: #a5d6ff;">&#39;CsReferer&#39;</span>])[<span style="color: #a5d6ff;">&#39;CsReferer&#39;</span>].<span style="color: #79c0ff;">count</span>().<span style="color: #79c0ff;">nlargest</span>(<span style="color: #79c0ff;">15</span>).<span style="color: #79c0ff;">to_frame</span>()
</div><div class="line" data-line="5"><span style="color: #e6edf3;">top_ref</span>.<span style="color: #79c0ff;">rename</span>(<span style="color: #e6edf3;">columns</span><span style="color: #79c0ff;">=</span>&lbrace;<span style="color: #a5d6ff;">&#39;CsReferer&#39;</span>:<span style="color: #a5d6ff;">&#39;Cnt&#39;</span>&rbrace;, <span style="color: #e6edf3;">errors</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;raise&#39;</span>, <span style="color: #e6edf3;">inplace</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">True</span>)
</div><div class="line" data-line="6"><span style="color: #e6edf3;">top_ref</span>.<span style="color: #79c0ff;">reset_index</span>(<span style="color: #e6edf3;">level</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">0</span>, <span style="color: #e6edf3;">inplace</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">True</span>)
</div></code></pre>
<p>We can display this with Dash / Plotly as a table.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-python" translate="no" tabindex="0"><div class="line" data-line="1">
</div><div class="line" data-line="2"><span style="color: #ff7b72;">def</span> <span style="color: #d2a8ff;">generate_table</span>(<span style="color: #e6edf3;">dataframe</span>, <span style="color: #e6edf3;">max_rows</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">15</span>):
</div><div class="line" data-line="3">  <span style="color: #ff7b72;">return</span> <span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">Table</span>([
</div><div class="line" data-line="4">    <span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">Thead</span>(
</div><div class="line" data-line="5">      <span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">Tr</span>([<span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">Th</span>(<span style="color: #e6edf3;">col</span>) <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">col</span> <span style="color: #79c0ff;">in</span> <span style="color: #e6edf3;">dataframe</span>.<span style="color: #79c0ff;">columns</span>])
</div><div class="line" data-line="6">    ),
</div><div class="line" data-line="7">    <span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">Tbody</span>([
</div><div class="line" data-line="8">      <span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">Tr</span>([
</div><div class="line" data-line="9">          <span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">Td</span>(<span style="color: #e6edf3;">dataframe</span>.<span style="color: #79c0ff;">iloc</span>[<span style="color: #e6edf3;">i</span>][<span style="color: #e6edf3;">col</span>]) <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">col</span> <span style="color: #79c0ff;">in</span> <span style="color: #e6edf3;">dataframe</span>.<span style="color: #79c0ff;">columns</span>
</div><div class="line" data-line="10">      ]) <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">i</span> <span style="color: #79c0ff;">in</span> <span style="color: #d2a8ff;">range</span>(<span style="color: #d2a8ff;">min</span>(<span style="color: #d2a8ff;">len</span>(<span style="color: #e6edf3;">dataframe</span>), <span style="color: #e6edf3;">max_rows</span>))
</div><div class="line" data-line="11">    ])
</div><div class="line" data-line="12">  ])
</div><div class="line" data-line="13">
</div><div class="line" data-line="14"><span style="color: #ff7b72;">def</span> <span style="color: #d2a8ff;">generate_top_referes</span>():
</div><div class="line" data-line="15">  <span style="color: #e6edf3;">idx</span> <span style="color: #79c0ff;">=</span> <span style="color: #79c0ff;">0</span>
</div><div class="line" data-line="16">  <span style="color: #e6edf3;">top_refs</span> <span style="color: #79c0ff;">=</span> []
</div><div class="line" data-line="17">  <span style="color: #e6edf3;">top_refs</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">H3</span>(<span style="color: #a5d6ff;">&#39;Top Referers&#39;</span>))
</div><div class="line" data-line="18">  <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">top_ref</span> <span style="color: #79c0ff;">in</span> <span style="color: #d2a8ff;">get_top_referers</span>(<span style="color: #e6edf3;">months_report</span>):
</div><div class="line" data-line="19">    <span style="color: #e6edf3;">children</span> <span style="color: #79c0ff;">=</span> []
</div><div class="line" data-line="20">    <span style="color: #e6edf3;">children</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">H4</span>(<span style="color: #e6edf3;">s3_file_names</span>[<span style="color: #e6edf3;">idx</span>]))
</div><div class="line" data-line="21">    <span style="color: #e6edf3;">children</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #d2a8ff;">generate_table</span>(<span style="color: #e6edf3;">top_ref</span>))
</div><div class="line" data-line="22">    <span style="color: #e6edf3;">top_refs</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">Div</span>(<span style="color: #e6edf3;">children</span>, <span style="color: #e6edf3;">className</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;row&#39;</span>))
</div><div class="line" data-line="23">    <span style="color: #e6edf3;">idx</span> <span style="color: #79c0ff;">+=</span> <span style="color: #79c0ff;">1</span>
</div></code></pre>
<p>This yields a simple table:</p>
<p><img src="/static/img/blog/top_referers_2020.11.png" alt="Top Referers" /></p>
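<p>The filtering and counting behind this table does not strictly need Pandas. A minimal sketch with <code>collections.Counter</code>, assuming the referer values are available as a plain list of strings; the sample values below are made up and not taken from the real logs:</p>

```python
from collections import Counter


def top_referrers(referers, own_host="dev.l1x.be", n=15):
    """Count external referrers, skipping '-' and self-referrals."""
    external = (
        r for r in referers
        if r != "-" and own_host not in r
    )
    return Counter(external).most_common(n)


# Hypothetical sample values for illustration only.
sample = [
    "-",
    "https://news.ycombinator.com/",
    "https://dev.l1x.be/posts/",
    "https://news.ycombinator.com/",
    "https://www.google.com/",
]

if __name__ == "__main__":
    for referer, count in top_referrers(sample):
        print(count, referer)
```

<p>The substring check is a simplification of the regular expression used in the Pandas version.</p>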
<h3 id="top-posts"><a href="#top-posts">Top Posts</a></h3>
<p>This metric is about the most visited posts. I had to filter out the URLs for CSS, fonts, and other static assets.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-python" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">def</span> <span style="color: #d2a8ff;">get_top_posts</span>(<span style="color: #e6edf3;">months_report</span>):
</div><div class="line" data-line="2">  <span style="color: #e6edf3;">top_urls</span> <span style="color: #79c0ff;">=</span> []
</div><div class="line" data-line="3">  <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">df</span> <span style="color: #79c0ff;">in</span> <span style="color: #e6edf3;">months_report</span>:
</div><div class="line" data-line="4">    <span style="color: #e6edf3;">df</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">df</span>[<span style="color: #e6edf3;">df</span>[<span style="color: #a5d6ff;">&#39;CsUriStreamClean&#39;</span>].<span style="color: #79c0ff;">str</span>.<span style="color: #79c0ff;">contains</span>(<span style="color: #a5d6ff;">&#39;posts&#39;</span>)].<span style="color: #79c0ff;">copy</span>()
</div><div class="line" data-line="5">    <span style="color: #e6edf3;">top_url</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">df</span>.<span style="color: #79c0ff;">groupby</span>([<span style="color: #a5d6ff;">&#39;CsUriStreamClean&#39;</span>])[<span style="color: #a5d6ff;">&#39;CsUriStreamClean&#39;</span>].<span style="color: #79c0ff;">count</span>().<span style="color: #79c0ff;">nlargest</span>(<span style="color: #79c0ff;">10</span>).<span style="color: #79c0ff;">to_frame</span>()
</div><div class="line" data-line="6">    <span style="color: #e6edf3;">top_url</span>.<span style="color: #79c0ff;">rename</span>(<span style="color: #e6edf3;">columns</span><span style="color: #79c0ff;">=</span>&lbrace;<span style="color: #a5d6ff;">&#39;CsUriStreamClean&#39;</span>:<span style="color: #a5d6ff;">&#39;Cnt&#39;</span>&rbrace;, <span style="color: #e6edf3;">errors</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;raise&#39;</span>, <span style="color: #e6edf3;">inplace</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">True</span>)
</div><div class="line" data-line="7">    <span style="color: #e6edf3;">top_url</span>.<span style="color: #79c0ff;">reset_index</span>(<span style="color: #e6edf3;">level</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">0</span>, <span style="color: #e6edf3;">inplace</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">True</span>)
</div><div class="line" data-line="8">    <span style="color: #e6edf3;">top_urls</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">top_url</span>)
</div><div class="line" data-line="9">  <span style="color: #ff7b72;">return</span> <span style="color: #e6edf3;">top_urls</span>
</div></code></pre>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-python" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">def</span> <span style="color: #d2a8ff;">generate_top_posts</span>():
</div><div class="line" data-line="2">  <span style="color: #e6edf3;">idx</span> <span style="color: #79c0ff;">=</span> <span style="color: #79c0ff;">0</span>
</div><div class="line" data-line="3">  <span style="color: #e6edf3;">top_posts</span> <span style="color: #79c0ff;">=</span> []
</div><div class="line" data-line="4">  <span style="color: #e6edf3;">top_posts</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">H3</span>(<span style="color: #a5d6ff;">&#39;Top Posts&#39;</span>))
</div><div class="line" data-line="5">  <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">top_url</span> <span style="color: #79c0ff;">in</span> <span style="color: #d2a8ff;">get_top_posts</span>(<span style="color: #e6edf3;">months_report</span>):
</div><div class="line" data-line="6">    <span style="color: #e6edf3;">children</span> <span style="color: #79c0ff;">=</span> []
</div><div class="line" data-line="7">    <span style="color: #e6edf3;">children</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">H4</span>(<span style="color: #e6edf3;">s3_file_names</span>[<span style="color: #e6edf3;">idx</span>]))
</div><div class="line" data-line="8">    <span style="color: #e6edf3;">children</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #d2a8ff;">generate_table</span>(<span style="color: #e6edf3;">top_url</span>))
</div><div class="line" data-line="9">    <span style="color: #e6edf3;">top_posts</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">Div</span>(<span style="color: #e6edf3;">children</span>, <span style="color: #e6edf3;">className</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;row&#39;</span>))
</div><div class="line" data-line="10">    <span style="color: #e6edf3;">idx</span> <span style="color: #79c0ff;">+=</span> <span style="color: #79c0ff;">1</span>
</div></code></pre>
<p>And it looks like this:</p>
<p><img src="/static/img/blog/top_posts_2020.11.png" alt="Top Posts" /></p>
<h3 id="top-iatas"><a href="#top-iatas">Top IATAs</a></h3>
<p>I was also a bit curious about where the readers are from. AWS uses IATA airport codes to name its points of presence (PoPs). This is easy to visualize:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-python" translate="no" tabindex="0"><div class="line" data-line="1">
</div><div class="line" data-line="2"><span style="color: #ff7b72;">def</span> <span style="color: #d2a8ff;">get_iata_codes</span>(<span style="color: #e6edf3;">months_report</span>):
</div><div class="line" data-line="3">  <span style="color: #e6edf3;">top_iata</span> <span style="color: #79c0ff;">=</span> []
</div><div class="line" data-line="4">  <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">df</span> <span style="color: #79c0ff;">in</span> <span style="color: #e6edf3;">months_report</span>:
</div><div class="line" data-line="5">    <span style="color: #e6edf3;">data</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">df</span>.<span style="color: #79c0ff;">groupby</span>([<span style="color: #a5d6ff;">&#39;IATA&#39;</span>])[<span style="color: #a5d6ff;">&#39;hcip&#39;</span>].<span style="color: #79c0ff;">count</span>().<span style="color: #79c0ff;">nlargest</span>(<span style="color: #79c0ff;">30</span>).<span style="color: #79c0ff;">to_frame</span>()
</div><div class="line" data-line="6">    <span style="color: #e6edf3;">fig</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">px</span>.<span style="color: #79c0ff;">bar</span>(<span style="color: #e6edf3;">data</span>, <span style="color: #e6edf3;">x</span><span style="color: #79c0ff;">=</span><span style="color: #e6edf3;">data</span>.<span style="color: #79c0ff;">index</span>, <span style="color: #e6edf3;">y</span><span style="color: #79c0ff;">=</span><span style="color: #e6edf3;">data</span>.<span style="color: #79c0ff;">hcip</span>)
</div><div class="line" data-line="7">    <span style="color: #e6edf3;">top_iata</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">fig</span>)
</div><div class="line" data-line="8">  <span style="color: #ff7b72;">return</span> <span style="color: #e6edf3;">top_iata</span>
</div><div class="line" data-line="9">
</div><div class="line" data-line="10"><span style="color: #ff7b72;">def</span> <span style="color: #d2a8ff;">generate_top_iata</span>():
</div><div class="line" data-line="11">  <span style="color: #e6edf3;">idx</span> <span style="color: #79c0ff;">=</span> <span style="color: #79c0ff;">0</span>
</div><div class="line" data-line="12">  <span style="color: #e6edf3;">top_iatas</span> <span style="color: #79c0ff;">=</span> []
</div><div class="line" data-line="13">  <span style="color: #e6edf3;">top_iatas</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">H3</span>(<span style="color: #a5d6ff;">&#39;Top IATAs&#39;</span>))
</div><div class="line" data-line="14">  <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">top_iata</span> <span style="color: #79c0ff;">in</span> <span style="color: #d2a8ff;">get_iata_codes</span>(<span style="color: #e6edf3;">months_report</span>):
</div><div class="line" data-line="15">    <span style="color: #e6edf3;">children</span> <span style="color: #79c0ff;">=</span> []
</div><div class="line" data-line="16">    <span style="color: #e6edf3;">children</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">H4</span>(<span style="color: #e6edf3;">s3_file_names</span>[<span style="color: #e6edf3;">idx</span>]))
</div><div class="line" data-line="17">    <span style="color: #e6edf3;">children</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">dcc</span>.<span style="color: #79c0ff;">Graph</span>(
</div><div class="line" data-line="18">        <span style="color: #e6edf3;">id</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;Top IATA codes where readers are &lbrace;&rbrace;&#39;</span>.<span style="color: #79c0ff;">format</span>(<span style="color: #e6edf3;">idx</span>),
</div><div class="line" data-line="19">        <span style="color: #e6edf3;">figure</span><span style="color: #79c0ff;">=</span><span style="color: #e6edf3;">top_iata</span>
</div><div class="line" data-line="20">    ))
</div><div class="line" data-line="21">    <span style="color: #e6edf3;">top_iatas</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">Div</span>(<span style="color: #e6edf3;">children</span>, <span style="color: #e6edf3;">className</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;row&#39;</span>))
</div><div class="line" data-line="22">    <span style="color: #e6edf3;">idx</span> <span style="color: #79c0ff;">+=</span> <span style="color: #79c0ff;">1</span>
</div></code></pre>
<p>Using a simple bar chart:</p>
<p><img src="/static/img/blog/top_iatas_2020.11.png" alt="Top IATAs" /></p>
<p>Full code is <a href="https://github.com/l1x/s3logs/blob/main/viz/viz.py">here</a>.</p>
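<p>The code above groups by an <code>IATA</code> column that already exists in the DataFrame. One way such a column could be derived is sketched below, under the assumption that the edge location identifier in the logs starts with a three-letter code followed by an arbitrary suffix (for example <code>FRA2-C1</code>). The helper names and sample values are hypothetical and not taken from the linked repository.</p>

```python
import re
from collections import Counter

# Assumption: an edge location id starts with three letters (the IATA-style code),
# followed by an arbitrary suffix, for example "FRA2-C1".
EDGE_RE = re.compile(r"^([A-Za-z]{3})")


def iata_code(edge_location):
    """Return the upper-cased three-letter prefix, or None if there is none."""
    match = EDGE_RE.match(edge_location)
    return match.group(1).upper() if match else None


def top_iata(edge_locations, n=30):
    """Count hits per code, ignoring values without a recognizable prefix."""
    codes = (iata_code(e) for e in edge_locations)
    return Counter(c for c in codes if c is not None).most_common(n)


if __name__ == "__main__":
    # Hypothetical sample values for illustration only.
    print(top_iata(["FRA2-C1", "FRA50-C1", "LHR3-C2", "-"]))
```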
<h3 id="hit-distribution-over-time"><a href="#hit-distribution-over-time">Hit distribution over time</a></h3>
<p>For this metric, it would be great to use a heatmap. Luckily Plotly has a highly customizable heatmap that is easy to use.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-python" translate="no" tabindex="0"><div class="line" data-line="1">
</div><div class="line" data-line="2"><span style="color: #ff7b72;">def</span> <span style="color: #d2a8ff;">visitor_heatmap</span>(<span style="color: #e6edf3;">df</span>, <span style="color: #e6edf3;">scale</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;lin&#39;</span>):
</div><div class="line" data-line="3">
</div><div class="line" data-line="4">  <span style="color: #e6edf3;">dfa</span> <span style="color: #79c0ff;">=</span> <span style="color: #d2a8ff;">convert_to_time_indexed</span>(<span style="color: #d2a8ff;">generate_aggregates</span>(<span style="color: #e6edf3;">df</span>, [<span style="color: #a5d6ff;">&#39;day&#39;</span>, <span style="color: #a5d6ff;">&#39;hour&#39;</span>], <span style="color: #a5d6ff;">&#39;ScStatus&#39;</span>), <span style="color: #a5d6ff;">&#39;day&#39;</span>)
</div><div class="line" data-line="5">
</div><div class="line" data-line="6">  <span style="color: #e6edf3;">size_lin</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">dfa</span>.<span style="color: #79c0ff;">Cnt</span>.<span style="color: #79c0ff;">values</span>
</div><div class="line" data-line="7">  <span style="color: #e6edf3;">size_log_2</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">np</span>.<span style="color: #79c0ff;">log</span>(<span style="color: #e6edf3;">dfa</span>.<span style="color: #79c0ff;">Cnt</span>.<span style="color: #79c0ff;">values</span>) <span style="color: #79c0ff;">/</span> <span style="color: #e6edf3;">np</span>.<span style="color: #79c0ff;">log</span>(<span style="color: #79c0ff;">2</span>)
</div><div class="line" data-line="8">
</div><div class="line" data-line="9">  <span style="color: #e6edf3;">size</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">size_lin</span> <span style="color: #ff7b72;">if</span> <span style="color: #e6edf3;">scale</span><span style="color: #79c0ff;">==</span><span style="color: #a5d6ff;">&#39;lin&#39;</span> <span style="color: #ff7b72;">else</span> <span style="color: #e6edf3;">size_log_2</span>
</div><div class="line" data-line="10">
</div><div class="line" data-line="11">  <span style="color: #e6edf3;">fig</span> <span style="color: #79c0ff;">=</span> <span style="color: #e6edf3;">go</span>.<span style="color: #79c0ff;">Figure</span>(
</div><div class="line" data-line="12">    <span style="color: #e6edf3;">data</span><span style="color: #79c0ff;">=</span><span style="color: #e6edf3;">go</span>.<span style="color: #79c0ff;">Scattergl</span>(
</div><div class="line" data-line="13">      <span style="color: #e6edf3;">x</span><span style="color: #79c0ff;">=</span><span style="color: #e6edf3;">dfa</span>.<span style="color: #79c0ff;">index</span>,
</div><div class="line" data-line="14">      <span style="color: #e6edf3;">y</span><span style="color: #79c0ff;">=</span><span style="color: #e6edf3;">dfa</span>[<span style="color: #a5d6ff;">&#39;hour&#39;</span>],
</div><div class="line" data-line="15">      <span style="color: #e6edf3;">mode</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;markers&#39;</span>,
</div><div class="line" data-line="16">      <span style="color: #e6edf3;">marker</span><span style="color: #79c0ff;">=</span><span style="color: #d2a8ff;">dict</span>(
</div><div class="line" data-line="17">        <span style="color: #e6edf3;">color</span><span style="color: #79c0ff;">=</span><span style="color: #e6edf3;">dfa</span>.<span style="color: #79c0ff;">Cnt</span>,
</div><div class="line" data-line="18">        <span style="color: #e6edf3;">colorscale</span> <span style="color: #79c0ff;">=</span> <span style="color: #a5d6ff;">&#39;portland&#39;</span>,
</div><div class="line" data-line="19">        <span style="color: #e6edf3;">line_width</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">1</span>,
</div><div class="line" data-line="20">        <span style="color: #e6edf3;">size</span><span style="color: #79c0ff;">=</span><span style="color: #e6edf3;">size</span>,
</div><div class="line" data-line="21">        <span style="color: #e6edf3;">showscale</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">True</span>,
</div><div class="line" data-line="22">        <span style="color: #e6edf3;">sizemin</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">3</span>
</div><div class="line" data-line="23">      )
</div><div class="line" data-line="24">    )
</div><div class="line" data-line="25">  )
</div><div class="line" data-line="26">
</div><div class="line" data-line="27">  <span style="color: #e6edf3;">fig</span>.<span style="color: #79c0ff;">update_layout</span>(
</div><div class="line" data-line="28">    <span style="color: #e6edf3;">height</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">600</span>,
</div><div class="line" data-line="29">    <span style="color: #e6edf3;">title_text</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;Number of requests per hour over time&#39;</span>,
</div><div class="line" data-line="30">    <span style="color: #e6edf3;">yaxis_nticks</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">24</span>,
</div><div class="line" data-line="31">    <span style="color: #e6edf3;">xaxis_nticks</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">31</span>
</div><div class="line" data-line="32">  )
</div><div class="line" data-line="33">
</div><div class="line" data-line="34">  <span style="color: #e6edf3;">fig</span>.<span style="color: #79c0ff;">update_yaxes</span>(<span style="color: #e6edf3;">autorange</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&quot;reversed&quot;</span>)
</div><div class="line" data-line="35">  <span style="color: #ff7b72;">return</span> <span style="color: #e6edf3;">fig</span>
</div><div class="line" data-line="36">
</div><div class="line" data-line="37">
</div><div class="line" data-line="38"><span style="color: #ff7b72;">def</span> <span style="color: #d2a8ff;">get_visiting_times</span>(<span style="color: #e6edf3;">months_report</span>):
</div><div class="line" data-line="39">  <span style="color: #e6edf3;">times</span> <span style="color: #79c0ff;">=</span> []
</div><div class="line" data-line="40">  <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">df</span> <span style="color: #79c0ff;">in</span> <span style="color: #e6edf3;">months_report</span>:
</div><div class="line" data-line="41">    <span style="color: #e6edf3;">fig</span> <span style="color: #79c0ff;">=</span> <span style="color: #d2a8ff;">visitor_heatmap</span>(<span style="color: #e6edf3;">df</span>, <span style="color: #a5d6ff;">&#39;log&#39;</span>)
</div><div class="line" data-line="42">    <span style="color: #e6edf3;">times</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">fig</span>)
</div><div class="line" data-line="43">  <span style="color: #ff7b72;">return</span> <span style="color: #e6edf3;">times</span>
</div><div class="line" data-line="44">
</div><div class="line" data-line="45"><span style="color: #ff7b72;">def</span> <span style="color: #d2a8ff;">generate_visting_times</span>():
</div><div class="line" data-line="46">  <span style="color: #e6edf3;">idx</span> <span style="color: #79c0ff;">=</span> <span style="color: #79c0ff;">0</span>
</div><div class="line" data-line="47">  <span style="color: #e6edf3;">times</span> <span style="color: #79c0ff;">=</span> []
</div><div class="line" data-line="48">  <span style="color: #e6edf3;">times</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">H3</span>(<span style="color: #a5d6ff;">&#39;Visiting Times&#39;</span>))
</div><div class="line" data-line="49">  <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">time_fig</span> <span style="color: #79c0ff;">in</span> <span style="color: #d2a8ff;">get_visiting_times</span>(<span style="color: #e6edf3;">months_report</span>):
</div><div class="line" data-line="50">    <span style="color: #e6edf3;">children</span> <span style="color: #79c0ff;">=</span> []
</div><div class="line" data-line="51">    <span style="color: #e6edf3;">children</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">H4</span>(<span style="color: #e6edf3;">s3_file_names</span>[<span style="color: #e6edf3;">idx</span>]))
</div><div class="line" data-line="52">    <span style="color: #e6edf3;">children</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">dcc</span>.<span style="color: #79c0ff;">Graph</span>(
</div><div class="line" data-line="53">        <span style="color: #e6edf3;">id</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;Hit distribution over time &lbrace;&rbrace;&#39;</span>.<span style="color: #79c0ff;">format</span>(<span style="color: #e6edf3;">idx</span>),
</div><div class="line" data-line="54">        <span style="color: #e6edf3;">figure</span><span style="color: #79c0ff;">=</span><span style="color: #e6edf3;">time_fig</span>
</div><div class="line" data-line="55">    ))
</div><div class="line" data-line="56">    <span style="color: #e6edf3;">times</span>.<span style="color: #79c0ff;">append</span>(<span style="color: #e6edf3;">html</span>.<span style="color: #79c0ff;">Div</span>(<span style="color: #e6edf3;">children</span>, <span style="color: #e6edf3;">className</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;row&#39;</span>))
</div><div class="line" data-line="57">    <span style="color: #e6edf3;">idx</span> <span style="color: #79c0ff;">+=</span> <span style="color: #79c0ff;">1</span>
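</div><div class="line" data-line="58">  <span style="color: #ff7b72;">return</span> <span style="color: #e6edf3;">times</span>
</div></code></pre>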
<p><img src="/static/img/blog/visiting_times_2020.11.png" alt="Visiting Times" /></p>
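<p>The two ideas in the function above, counting requests per day and hour and then compressing the counts with a base-2 logarithm so that busy hours do not dwarf quiet ones, can be tried without Plotly. The following is a minimal standard-library sketch; <code>count_per_day_hour</code> and <code>marker_sizes</code> are hypothetical stand-ins for illustration, not the helpers used in <code>viz.py</code>, and the timestamps are made up.</p>

```python
import math
from collections import Counter
from datetime import datetime


def count_per_day_hour(timestamps):
    """Count requests per (day, hour) bucket from ISO 8601 timestamps."""
    buckets = Counter()
    for ts in timestamps:
        parsed = datetime.fromisoformat(ts)
        buckets[(parsed.date().isoformat(), parsed.hour)] += 1
    return buckets


def marker_sizes(counts, scale="lin"):
    """Return marker sizes: raw counts for 'lin', log2(count) otherwise."""
    if scale == "lin":
        return [float(c) for c in counts]
    # Every bucket holds at least one request, so log2 is always defined.
    return [math.log2(c) for c in counts]


# Made-up sample data: three hits in one hour, one hit in another.
sample = [
    "2020-11-01T10:05:00",
    "2020-11-01T10:17:00",
    "2020-11-01T10:59:00",
    "2020-11-02T23:30:00",
]
buckets = count_per_day_hour(sample)
for (day, hour), cnt in sorted(buckets.items()):
    print(day, hour, cnt)
```

<p>A bucket with a single request gets a logarithmic size of 0, which is why a floor such as <code>sizemin=3</code> in the marker settings is useful to keep those points visible.</p>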
<p>Full code is <a href="https://github.com/l1x/s3logs/blob/main/viz/viz.py">here</a>.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sun, 20 Dec 2020 17:23:21 +0100</pubDate>
    </item>
    <item>
      <title>Running ASP.Net web application with Falco on AWS Lambda</title>
      <link>https://dev.l1x.be/posts/2020/12/18/running-asp.net-web-application-with-falco-on-aws-lambda/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2020/12/18/running-asp.net-web-application-with-falco-on-aws-lambda/</guid>
      <content:encoded><![CDATA[<h2 id="intro"><a href="#intro">Intro</a></h2>
<p>This article was written by Gabor Gergely (kodfodrasz) as a guest post on this blog. He is the lead engineer of our engineering organization, working on F#, AWS and Elm.</p>
<p>We have been using AWS Lambda with F# for a while and have some experience with it. Until now, we opted to use the plain AWS Lambda .Net Runtime provided by Amazon because we value simplicity and transparent code whose operation we can reason about. This is also why we opted for F#: it hits a sweet spot between functional code that is simple to reason about and the rich features of the .Net Framework, all with good performance for our use case.</p>
<p>Still, we have run into some pain points when developing for and running on AWS Lambda:</p>
<p>One is that the edit-compile-run loop takes a very long time, as it involves deployment to Lambda. Testing and interactive debugging are really slow this way. To overcome this, one needs an emulator environment for the Lambda Runtime. Developing one is additional maintenance overhead, so keeping an eye out for a suitable out-of-the-box solution became a background task in my mind.</p>
<p>While reading comments about the announcement of F# 5.0, I found out about the Falco framework, which is a functional-first ASP.Net Core framework. I wondered whether opting for an ASP.Net hosting in Lambda could provide a setup that allows simple local/CI execution while also providing a mostly similar hosting environment in AWS Lambda... so I started to build an experimental setup to evaluate the idea.</p>
<p>The experiment has three main steps:</p>
<ol>
<li>Getting ASP.Net &amp; Falco running locally with a simple web application</li>
<li>Deploying to AWS Lambda</li>
<li>Using .Net 5.0 in AWS Lambda</li>
</ol>
<h2 id="getting-aspnet-amp-falco-running-locally-with-a-simple-web-application"><a href="#getting-aspnet-amp-falco-running-locally-with-a-simple-web-application">Getting ASP.Net &amp; Falco running locally with a simple web application</a></h2>
<h3 id="introducing-falco"><a href="#introducing-falco">Introducing Falco</a></h3>
<p><a href="https://www.falcoframework.com/">Falco Framework</a> is an F# web application framework, based on ASP.Net Core. It is still a work in progress, but its minimalism and non-intrusive API made it really appealing to me. I have checked out some other F# web frameworks, but this is the first that feels cleaner and more straightforward than using plain ASP.Net in the same object-oriented way I used it from C# in the past.</p>
<h3 id="hello-world-from-local-falco"><a href="#hello-world-from-local-falco">Hello World from local Falco</a></h3>
<p>First, let's create a simple sample app to test the hosting setup. Let's stick to <a href="https://www.falcoframework.com/#getting-started">Falco's Getting Started Guide</a>.</p>
<p>Let's create the official HelloWorld application!</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">dotnet</span> <span style="color: #e6edf3;">new</span> <span style="color: #e6edf3;">-i</span> <span style="color: #a5d6ff;">&quot;Falco.Template::*&quot;</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">dotnet</span> <span style="color: #e6edf3;">new</span> <span style="color: #e6edf3;">falco</span> <span style="color: #e6edf3;">-o</span> <span style="color: #e6edf3;">HelloFalco</span>
</div></code></pre>
<p>This installs the Falco templates, then creates a project in the <code>HelloFalco</code> directory based on the <code>falco</code> template just installed.</p>
<p>I usually use Visual Studio, so adding a solution for the project is also a good idea (for me).</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">dotnet</span> <span style="color: #e6edf3;">new</span> <span style="color: #e6edf3;">solution</span> <span style="color: #e6edf3;">-n</span> <span style="color: #e6edf3;">HelloFalco</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">dotnet</span> <span style="color: #e6edf3;">sln</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">HelloFalco</span>
</div></code></pre>
<p>Now I can simply open the solution and check the <code>Program.fs</code> file. For now, we simply stick to the contents in the template.</p>
<p>Let's start it! If you are using Visual Studio, choose the HelloWordApp from the run configurations to use Kestrel-based hosting instead of IIS Express.</p>
<p><img src="/static/img/blog/advent_article_part1-run_config.jpg" alt="Where to select that run configuration" /></p>
<p>Now you should see something like this:</p>
<p><img src="/static/img/blog/advent_article_part1-it_runs.jpg" alt="It runs!" /></p>
<h3 id="adding-a-route"><a href="#adding-a-route">Adding a route</a></h3>
<p>Before we move on to the next phase, let's add another route, a parameterized one from the Falco examples. This will be used to validate the hosting setup in the next phase. The complete program is still very compact and looks as follows:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">module HelloFalco.Program
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">open Falco
</div><div class="line" data-line="4">open Falco.Routing
</div><div class="line" data-line="5">open Falco.HostBuilder
</div><div class="line" data-line="6">open Microsoft.AspNetCore.Builder
</div><div class="line" data-line="7">open Microsoft.AspNetCore.Hosting
</div><div class="line" data-line="8">open Microsoft.Extensions.DependencyInjection
</div><div class="line" data-line="9">
</div><div class="line" data-line="10">// ------------
</div><div class="line" data-line="11">// Register services
</div><div class="line" data-line="12">// ------------
</div><div class="line" data-line="13">let configureServices (services : IServiceCollection) =
</div><div class="line" data-line="14">    services.AddFalco() |&gt; ignore
</div><div class="line" data-line="15">
</div><div class="line" data-line="16">// ------------
</div><div class="line" data-line="17">// Activate middleware
</div><div class="line" data-line="18">// ------------
</div><div class="line" data-line="19">let configureApp (endpoints : HttpEndpoint list) (ctx : WebHostBuilderContext) (app : IApplicationBuilder) =
</div><div class="line" data-line="20">    let devMode = StringUtils.strEquals ctx.HostingEnvironment.EnvironmentName &quot;Development&quot;
</div><div class="line" data-line="21">    app.UseWhen(devMode, fun app -&gt;
</div><div class="line" data-line="22">            app.UseDeveloperExceptionPage())
</div><div class="line" data-line="23">       .UseWhen(not(devMode), fun app -&gt;
</div><div class="line" data-line="24">            app.UseFalcoExceptionHandler(Response.withStatusCode 500 &gt;&gt; Response.ofPlainText &quot;Server error&quot;))
</div><div class="line" data-line="25">       .UseFalco(endpoints) |&gt; ignore
</div><div class="line" data-line="26">
</div><div class="line" data-line="27">// -----------
</div><div class="line" data-line="28">// Configure Web host
</div><div class="line" data-line="29">// -----------
</div><div class="line" data-line="30">let configureWebHost (endpoints : HttpEndpoint list) (webHost : IWebHostBuilder) =
</div><div class="line" data-line="31">    webHost
</div><div class="line" data-line="32">        .ConfigureServices(configureServices)
</div><div class="line" data-line="33">        .Configure(configureApp endpoints)
</div><div class="line" data-line="34">
</div><div class="line" data-line="35">let helloHandler: HttpHandler =
</div><div class="line" data-line="36">  let getMessage (route: RouteCollectionReader) =
</div><div class="line" data-line="37">    route.GetString &quot;name&quot; &quot;stranger&quot;
</div><div class="line" data-line="38">    |&gt; sprintf &quot;Hello %s!&quot;
</div><div class="line" data-line="39">
</div><div class="line" data-line="40">  Request.mapRoute getMessage Response.ofPlainText
</div><div class="line" data-line="41">
</div><div class="line" data-line="42">[&lt;EntryPoint&gt;]
</div><div class="line" data-line="43">let main args =
</div><div class="line" data-line="44">  webHost args &lbrace;
</div><div class="line" data-line="45">    configure configureWebHost
</div><div class="line" data-line="46">
</div><div class="line" data-line="47">    endpoints [ get &quot;/&quot; (Response.ofPlainText &quot;Hello world&quot;)
</div><div class="line" data-line="48">                get &quot;/hello/&lbrace;name?&rbrace;&quot; helloHandler ]
</div><div class="line" data-line="49">  &rbrace;
</div><div class="line" data-line="50">  0
</div></code></pre>
<p>Its operation can be quickly verified with cURL.</p>
<p><img src="/static/img/advent_article_part1_additional_routes.jpg" alt="Image of locally running service responding to queries as expected" /></p>
<p>It runs and responds as expected. 🎉👏</p>
<p>Ok, nothing extraordinary yet. We managed to start a vanilla demo template then extended it with some other demo code. Time for something more interesting.</p>
<h2 id="deploying-to-aws-lambda"><a href="#deploying-to-aws-lambda">Deploying to AWS Lambda</a></h2>
<p>We have managed to reproduce the official Falco tutorial so far. As mentioned earlier, we are actively using AWS Lambda, so now I will show how to deploy a Falco application to Lambda!</p>
<p>For this, I will be using <a href="https://docs.aws.amazon.com/serverless-application-model/">AWS Serverless Application Model</a>, with its .Net tooling.
I have also used some AWS Lambda function templates as a basis for the configuration files I will present below. All of these can be installed using the dotnet command.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">dotnet</span> <span style="color: #e6edf3;">tool</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">-g</span> <span style="color: #e6edf3;">Amazon.Lambda.Tools</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">dotnet</span> <span style="color: #e6edf3;">new</span> <span style="color: #e6edf3;">-i</span> <span style="color: #a5d6ff;">&quot;Amazon.Lambda.Templates::*&quot;</span>
</div></code></pre>
<p>The template I used is the <code>serverless.AspNetCoreWebAPI</code> one. If you are interested, you can check it out with <code>dotnet new serverless.AspNetCoreWebAPI --language F# --name HelloServerlessAsp</code>.</p>
<h3 id="aws-infrastructure-prerequisites"><a href="#aws-infrastructure-prerequisites">AWS infrastructure prerequisites</a></h3>
<p>Deployment to Lambda using AWS SAM needs a bucket for the storage of the packaged Lambda functions. Assuming your AWS CLI is installed and your credentials are configured, it is a straightforward command to create it. I created the bucket <code>lambda.kodfodrasz.net</code> using my default credentials in the region closest to me.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">aws</span> <span style="color: #e6edf3;">s3api</span> <span style="color: #e6edf3;">create-bucket</span> <span style="color: #e6edf3;">--bucket</span> <span style="color: #e6edf3;">lambda.kodfodrasz.net</span> <span style="color: #e6edf3;">--acl</span> <span style="color: #e6edf3;">private</span> <span style="color: #e6edf3;">--create-bucket-configuration</span> <span style="color: #e6edf3;">LocationConstraint=eu-central-1</span>
</div></code></pre>
<h3 id="configuring-a-lambda-compatible-net-version"><a href="#configuring-a-lambda-compatible-net-version">Configuring a Lambda compatible .Net version</a></h3>
<p>As of writing this article, .Net 5.0 has recently been released. However, it is not officially supported out-of-the-box by Amazon, though workarounds exist.
I have .Net 5.0 installed on my machine, so unless configured otherwise, the app would target that version, which would ultimately result in runtime failures at Lambda invocation time.</p>
<p>To prevent these problems, we must adjust the version settings to target the latest supported runtime version in AWS Lambda: .Net Core 3.1.</p>
<p>First, let's retarget the <code>HelloFalco.fsproj</code> project, to contain <code>&lt;TargetFramework&gt;netcoreapp3.1&lt;/TargetFramework&gt;</code>.</p>
<p>Second, a <code>global.json</code> file needs to be added and set up to prevent <a href="https://docs.microsoft.com/en-us/dotnet/core/tools/global-json?tabs=netcore3x#rollforward">automatic version <em>roll forward</em></a> to a newer major SDK version such as 5.0. Add the <code>global.json</code> file to the root of the projects (the solution directory).</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-json" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="2">  <span style="color: #79c0ff;">&quot;sdk&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="3">    <span style="color: #79c0ff;">&quot;version&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;3.1.100&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="4">    <span style="color: #79c0ff;">&quot;rollForward&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;latestMinor&quot;</span>
</div><div class="line" data-line="5">  <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="6"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
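<p>To see why <code>latestMinor</code> keeps the build on the 3.x line even with .Net 5.0 installed, here is a small Python sketch of the selection logic. It is a simplified model of the documented <code>rollForward</code> behavior for three policies only, ignoring preview versions and the other policies; the <code>select_sdk</code> function and the list of installed SDKs are hypothetical.</p>

```python
def parse(version):
    """Turn '3.1.404' into (3, 1, 404) so versions compare as tuples."""
    major, minor, band_patch = version.split(".")
    return int(major), int(minor), int(band_patch)


def select_sdk(requested, installed, roll_forward):
    """Pick an SDK under a simplified model of three rollForward policies."""
    want = parse(requested)
    if roll_forward == "disable":
        # Exact match only, no rolling forward at all.
        candidates = [v for v in installed if parse(v) == want]
    elif roll_forward == "latestMinor":
        # Highest installed version within the requested major version.
        candidates = [
            v for v in installed
            if parse(v) >= want and parse(v)[0] == want[0]
        ]
    elif roll_forward == "latestMajor":
        # Highest installed version overall.
        candidates = [v for v in installed if parse(v) >= want]
    else:
        raise ValueError(f"policy not modelled: {roll_forward}")
    return max(candidates, key=parse) if candidates else None


installed = ["3.1.404", "5.0.101"]  # hypothetical machine
print(select_sdk("3.1.100", installed, "latestMinor"))  # 3.1.404
print(select_sdk("3.1.100", installed, "latestMajor"))  # 5.0.101
```

<p>In this model, only a policy that is allowed to cross major versions ends up selecting the 5.0 SDK.</p>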
<h3 id="adapt-the-f-project-for-use-with-aws-sam-tools"><a href="#adapt-the-f-project-for-use-with-aws-sam-tools">Adapt the F# project for use with AWS SAM tools</a></h3>
<p>The Amazon Serverless templates have some special fields set up in their project file. We should add these to ensure Amazon tools handle the project properly.</p>
<p>Also, we will need an additional NuGet package, an adapter between the Lambda runtime and ASP.Net. You could add it using the command <code>dotnet add package Amazon.Lambda.AspNetCoreServer</code>, but I simply edited <code>HelloFalco.fsproj</code> by hand to include the changes mentioned.</p>
<p>However, the package alone is not enough. We need to add the AWS Lambda facet to the Falco project file <code>HelloFalco.fsproj</code> as well:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">&lt;Project Sdk=&quot;Microsoft.NET.Sdk.Web&quot;&gt;
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">  &lt;PropertyGroup&gt;
</div><div class="line" data-line="4">    &lt;TargetFramework&gt;netcoreapp3.1&lt;/TargetFramework&gt;
</div><div class="line" data-line="5">    &lt;!-- The next three properties were copied from the AWS Serverless template to ensure tooling compatibility (along with the next comment). --&gt;
</div><div class="line" data-line="6">    &lt;GenerateRuntimeConfigurationFiles&gt;true&lt;/GenerateRuntimeConfigurationFiles&gt;
</div><div class="line" data-line="7">    &lt;AWSProjectType&gt;Lambda&lt;/AWSProjectType&gt;
</div><div class="line" data-line="8">
</div><div class="line" data-line="9">    &lt;!-- This property makes the build directory similar to a publish directory and helps the AWS .NET Lambda Mock Test Tool find project dependencies. --&gt;
</div><div class="line" data-line="10">    &lt;CopyLocalLockFileAssemblies&gt;true&lt;/CopyLocalLockFileAssemblies&gt;
</div><div class="line" data-line="11">  &lt;/PropertyGroup&gt;
</div><div class="line" data-line="12">
</div><div class="line" data-line="13">  &lt;ItemGroup&gt;
</div><div class="line" data-line="14">    &lt;Compile Include=&quot;Program.fs&quot; /&gt;
</div><div class="line" data-line="15">  &lt;/ItemGroup&gt;
</div><div class="line" data-line="16">
</div><div class="line" data-line="17">  &lt;ItemGroup&gt;
</div><div class="line" data-line="18">    &lt;PackageReference Include=&quot;Falco&quot; Version=&quot;3.0.*&quot; /&gt;
</div><div class="line" data-line="19">    &lt;!-- Contains the adapter between AWS Lambda Runtime and ASP.Net Core --&gt;
</div><div class="line" data-line="20">    &lt;PackageReference Include=&quot;Amazon.Lambda.AspNetCoreServer&quot; Version=&quot;5.2.0&quot; /&gt;
</div><div class="line" data-line="21">  &lt;/ItemGroup&gt;
</div><div class="line" data-line="22">&lt;/Project&gt;
</div></code></pre>
<h3 id="modify-the-program-for-lambda-compatibility"><a href="#modify-the-program-for-lambda-compatibility">Modify the program for Lambda compatibility</a></h3>
<p>At this point, the program can be compiled and packaged for deployment to AWS. Before finalizing the other deployment configuration files, we need to change the program so that it supports being hosted both locally by Kestrel and in the cloud by the Lambda Runtime.</p>
<p>The modifications needed:</p>
<ul>
<li>Move the endpoints to an outer scope, so that both hosting setups can reach and use them.</li>
<li>Add a hosting setup for Lambda, and use the same configuration code as the local setup.</li>
</ul>
<p>The modified code is still pretty compact and straightforward (but I added some additional comments for easy navigability):</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">module HelloFalco.Program
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">open Falco
</div><div class="line" data-line="4">open Falco.Routing
</div><div class="line" data-line="5">open Falco.HostBuilder
</div><div class="line" data-line="6">open Microsoft.AspNetCore.Builder
</div><div class="line" data-line="7">open Microsoft.AspNetCore.Hosting
</div><div class="line" data-line="8">open Microsoft.Extensions.DependencyInjection
</div><div class="line" data-line="9">
</div><div class="line" data-line="10">// =============
</div><div class="line" data-line="11">// The endpoints
</div><div class="line" data-line="12">// =============
</div><div class="line" data-line="13">let helloHandler: HttpHandler =
</div><div class="line" data-line="14">  let getMessage (route: RouteCollectionReader) =
</div><div class="line" data-line="15">    route.GetString &quot;name&quot; &quot;stranger&quot;
</div><div class="line" data-line="16">    |&gt; sprintf &quot;Hello %s!&quot;
</div><div class="line" data-line="17">
</div><div class="line" data-line="18">  Request.mapRoute getMessage Response.ofPlainText
</div><div class="line" data-line="19">
</div><div class="line" data-line="20">let endpointList =
</div><div class="line" data-line="21">  [ get &quot;/&quot; (Response.ofPlainText &quot;Hello world&quot;)
</div><div class="line" data-line="22">    get &quot;/hello/&lbrace;name?&rbrace;&quot; helloHandler ]
</div><div class="line" data-line="23">
</div><div class="line" data-line="24">// ==========================
</div><div class="line" data-line="25">// Common initialization code
</div><div class="line" data-line="26">// ==========================
</div><div class="line" data-line="27">
</div><div class="line" data-line="28">// ------------
</div><div class="line" data-line="29">// Register services
</div><div class="line" data-line="30">// ------------
</div><div class="line" data-line="31">let configureServices (services: IServiceCollection) = services.AddFalco() |&gt; ignore
</div><div class="line" data-line="32">
</div><div class="line" data-line="33">// ------------
</div><div class="line" data-line="34">// Activate middleware
</div><div class="line" data-line="35">// ------------
</div><div class="line" data-line="36">let configureApp (endpoints: HttpEndpoint list) (ctx: WebHostBuilderContext) (app: IApplicationBuilder) =
</div><div class="line" data-line="37">  let devMode =
</div><div class="line" data-line="38">    StringUtils.strEquals ctx.HostingEnvironment.EnvironmentName &quot;Development&quot;
</div><div class="line" data-line="39">
</div><div class="line" data-line="40">  app.UseWhen(devMode, (fun app -&gt; app.UseDeveloperExceptionPage()))
</div><div class="line" data-line="41">     .UseWhen(not (devMode),
</div><div class="line" data-line="42">              (fun app -&gt;
</div><div class="line" data-line="43">                app.UseFalcoExceptionHandler
</div><div class="line" data-line="44">                  (Response.withStatusCode 500
</div><div class="line" data-line="45">                   &gt;&gt; Response.ofPlainText &quot;Server error&quot;))).UseHttpsRedirection().UseFalco(endpoints)
</div><div class="line" data-line="46">  |&gt; ignore
</div><div class="line" data-line="47">
</div><div class="line" data-line="48">// =======================
</div><div class="line" data-line="49">// AWS Lambda Startup code
</div><div class="line" data-line="50">// =======================
</div><div class="line" data-line="51">
</div><div class="line" data-line="52">// Lambda entry point
</div><div class="line" data-line="53">type LambdaEntryPoint() =
</div><div class="line" data-line="54">
</div><div class="line" data-line="55">  // The base class must be set to match the AWS service invoking the Lambda function. If not Amazon.Lambda.AspNetCoreServer
</div><div class="line" data-line="56">  // will fail to convert the incoming request correctly into a valid ASP.NET Core request.
</div><div class="line" data-line="57">  //
</div><div class="line" data-line="58">  // API Gateway REST API                         -&gt; Amazon.Lambda.AspNetCoreServer.APIGatewayProxyFunction
</div><div class="line" data-line="59">  // API Gateway HTTP API payload version 1.0     -&gt; Amazon.Lambda.AspNetCoreServer.APIGatewayProxyFunction
</div><div class="line" data-line="60">  // API Gateway HTTP API payload version 2.0     -&gt; Amazon.Lambda.AspNetCoreServer.APIGatewayHttpApiV2ProxyFunction
</div><div class="line" data-line="61">  // Application Load Balancer                    -&gt; Amazon.Lambda.AspNetCoreServer.ApplicationLoadBalancerFunction
</div><div class="line" data-line="62">  //
</div><div class="line" data-line="63">  // Note: When using the AWS::Serverless::Function resource with an event type of &quot;HttpApi&quot; then payload version 2.0
</div><div class="line" data-line="64">  // will be the default and you must make Amazon.Lambda.AspNetCoreServer.APIGatewayHttpApiV2ProxyFunction the base class.
</div><div class="line" data-line="65">  inherit Amazon.Lambda.AspNetCoreServer.APIGatewayProxyFunction()
</div><div class="line" data-line="66">
</div><div class="line" data-line="67">  override this.Init(builder: IWebHostBuilder) =
</div><div class="line" data-line="68">    builder
</div><div class="line" data-line="69">      .ConfigureServices(configureServices)
</div><div class="line" data-line="70">      .Configure(configureApp endpointList)
</div><div class="line" data-line="71">    |&gt; ignore
</div><div class="line" data-line="72">
</div><div class="line" data-line="73">
</div><div class="line" data-line="74">// -----------
</div><div class="line" data-line="75">// Configure Web host
</div><div class="line" data-line="76">// -----------
</div><div class="line" data-line="77">let configureWebHost (endpoints: HttpEndpoint list) (webHost: IWebHostBuilder) =
</div><div class="line" data-line="78">  webHost
</div><div class="line" data-line="79">    .ConfigureServices(configureServices)
</div><div class="line" data-line="80">    .Configure(configureApp endpoints)
</div><div class="line" data-line="81">
</div><div class="line" data-line="82">// ==========================
</div><div class="line" data-line="83">// Local Kestrel startup code
</div><div class="line" data-line="84">// ==========================
</div><div class="line" data-line="85">
</div><div class="line" data-line="86">// Local execution entry point
</div><div class="line" data-line="87">[&lt;EntryPoint&gt;]
</div><div class="line" data-line="88">let main args =
</div><div class="line" data-line="89">  webHost args &lbrace;
</div><div class="line" data-line="90">    configure configureWebHost
</div><div class="line" data-line="91">    endpoints endpointList
</div><div class="line" data-line="92">  &rbrace;
</div><div class="line" data-line="93">  0
</div></code></pre>
<p>One thing to note is that the common <code>UseStartup&lt;Startup&gt;()</code> pattern is not used here. I initially tried that pattern, but it didn't fit seamlessly with the initialization model of Falco. The approach I finally settled on is straightforward and convenient, and it is idiomatic in both Falco and vanilla ASP.Net Core.</p>
<p>We can try this code locally, and it will still work just as before.</p>
<h3 id="setting-up-the-deployment-configuration"><a href="#setting-up-the-deployment-configuration">Setting up the deployment configuration</a></h3>
<p>The final step before deployment is to define the deployment configuration for the SAM tools. We need to add two config files to the project directory.</p>
<p>First, add <code>aws-lambda-tools-defaults.json</code>. This file defines some defaults for the <code>dotnet lambda</code> command family. The deployment bucket created earlier is needed here. I specified my bucket, <code>lambda.kodfodrasz.net</code>. Also, an <em>S3 prefix</em> is needed, where the data related to this app will be stored. I simply used <code>HelloFalco</code>.</p>
<p>Also, we need to define a <em>stack name</em>, which is used to group the resources related to the Lambda function, for example, the API gateway settings. I chose the name <code>HelloFalco</code>.
If you want to use a profile and region other than your defaults, you can set them here as well. Otherwise, leave the fields as empty strings.</p>
<p>Overall my config <code>aws-lambda-tools-defaults.json</code> looks like this:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-json" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="2">  <span style="color: #79c0ff;">&quot;Information&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span>
</div><div class="line" data-line="3">    <span style="color: #a5d6ff;">&quot;This file provides default values for the deployment wizard inside Visual Studio and the AWS Lambda commands added to the .NET Core CLI.&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="4">    <span style="color: #a5d6ff;">&quot;To learn more about the Lambda commands with the .NET Core CLI execute the following command at the command line in the project root directory.&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="5">    <span style="color: #a5d6ff;">&quot;dotnet lambda help&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="6">    <span style="color: #a5d6ff;">&quot;All the command line options for the Lambda command can be specified in this file.&quot;</span>
</div><div class="line" data-line="7">  <span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="8">  <span style="color: #79c0ff;">&quot;profile&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="9">  <span style="color: #79c0ff;">&quot;region&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="10">  <span style="color: #79c0ff;">&quot;configuration&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;Release&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="11">  <span style="color: #79c0ff;">&quot;framework&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;netcoreapp3.1&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="12">  <span style="color: #79c0ff;">&quot;s3-prefix&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;HelloFalco&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="13">  <span style="color: #79c0ff;">&quot;template&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;serverless.template&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="14">  <span style="color: #79c0ff;">&quot;template-parameters&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="15">  <span style="color: #79c0ff;">&quot;stack-name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;HelloFalco&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="16">  <span style="color: #79c0ff;">&quot;s3-bucket&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;lambda.kodfodrasz.net&quot;</span>
</div><div class="line" data-line="17"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
<p>This file references a <em>template</em>, <code>serverless.template</code>. That will be the second config file we will need to create.
I copied this file over from the <code>serverless.AspNetCoreWebAPI</code> project template, and have tweaked it a bit.</p>
<p>The most important field is the <code>Handler</code> item, the entry point where the AWS Lambda Runtime invokes our code. It has the following structure:</p>
<p><code>&lt;assembly name&gt;::&lt;handler class fully qualified name&gt;::&lt;handler method&gt;</code>. In our case:</p>
<ul>
<li>The assembly name is <code>HelloFalco</code></li>
<li>The handler class FQN is <code>HelloFalco.Program+LambdaEntryPoint</code>. This means that it is an inner class (<code>LambdaEntryPoint</code>) of the (static) class <code>Program</code> in the <code>HelloFalco</code> namespace.
If you need to know more about F# and plain CLR interop, I suggest the great reads <a href="https://fsharpforfunandprofit.com/posts/classes/#tip-defining-classes-for-use-by-other-net-code">F# for Fun and profit about Classes</a> by <a href="https://twitter.com/ScottWlaschin">Scott Wlaschin</a> and <a href="https://connelhooley.uk/blog/2017/04/30/f-sharp-to-c-sharp">Calling F# Code in a C# Project</a> by <a href="https://twitter.com/connel_dev">Connel Hooley</a>.</li>
</ul>
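<p>To make the three parts of the handler string more tangible, here is a minimal sketch that splits such a string at the <code>::</code> separators. <code>split_handler</code> is a hypothetical helper written for this article, not part of any AWS tooling:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"># Hypothetical helper: split "assembly::type::method" into its three parts.
split_handler() {
  handler=$1
  # Everything before the first "::" is the assembly name.
  assembly=${handler%%::*}
  # Drop the assembly and the first "::".
  rest=${handler#*::}
  # What remains is "type::method".
  type_name=${rest%%::*}
  method=${rest#*::}
  printf 'assembly=%s\ntype=%s\nmethod=%s\n' "$assembly" "$type_name" "$method"
}
</code></pre>
<p>Calling <code>split_handler 'HelloFalco::HelloFalco.Program+LambdaEntryPoint::FunctionHandlerAsync'</code> prints the assembly, the type and the method on separate lines.</p>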
<p>The other important part is <code>Events</code>, which sets up the API Gateway events to be forwarded to the app. This setup forwards everything, and all routing is left to Falco/ASP.Net.
This is the main routing difference from our current raw Lambda setup, where API Gateway matches the route parts and the function simply reads the matched parts from the Lambda event.</p>
<p>The <code>Policies</code> item is also important: it contains the IAM policies applied to the web application. In this example I replaced the original template's <code>AWSLambdaFullAccess</code> with the far more restrictive <code>AWSLambdaBasicExecutionRole</code>, which should still suffice for this example.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-json" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="2">  <span style="color: #79c0ff;">&quot;AWSTemplateFormatVersion&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;2010-09-09&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="3">  <span style="color: #79c0ff;">&quot;Transform&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;AWS::Serverless-2016-10-31&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="4">  <span style="color: #79c0ff;">&quot;Description&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;An AWS Serverless Application that uses the ASP.NET Core framework running in Amazon Lambda.&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="5">  <span style="color: #79c0ff;">&quot;Parameters&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="6">  <span style="color: #79c0ff;">&quot;Resources&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="7">    <span style="color: #79c0ff;">&quot;AspNetCoreFunction&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="8">      <span style="color: #79c0ff;">&quot;Type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;AWS::Serverless::Function&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="9">      <span style="color: #79c0ff;">&quot;Properties&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="10">        <span style="color: #79c0ff;">&quot;Handler&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;HelloFalco::HelloFalco.Program+LambdaEntryPoint::FunctionHandlerAsync&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="11">        <span style="color: #79c0ff;">&quot;Runtime&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;dotnetcore3.1&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="12">        <span style="color: #79c0ff;">&quot;CodeUri&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="13">        <span style="color: #79c0ff;">&quot;MemorySize&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">256</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="14">        <span style="color: #79c0ff;">&quot;Timeout&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">10</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="15">        <span style="color: #79c0ff;">&quot;Role&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">null</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="16">        <span style="color: #79c0ff;">&quot;Policies&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span><span style="color: #a5d6ff;">&quot;AWSLambdaBasicExecutionRole&quot;</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="17">        <span style="color: #79c0ff;">&quot;Environment&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="18">          <span style="color: #79c0ff;">&quot;Variables&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="19">        <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="20">        <span style="color: #79c0ff;">&quot;Events&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="21">          <span style="color: #79c0ff;">&quot;ProxyResource&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="22">            <span style="color: #79c0ff;">&quot;Type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;Api&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="23">            <span style="color: #79c0ff;">&quot;Properties&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="24">              <span style="color: #79c0ff;">&quot;Path&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;/&lbrace;proxy+&rbrace;&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="25">              <span style="color: #79c0ff;">&quot;Method&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;ANY&quot;</span>
</div><div class="line" data-line="26">            <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="27">          <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="28">          <span style="color: #79c0ff;">&quot;RootResource&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="29">            <span style="color: #79c0ff;">&quot;Type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;Api&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="30">            <span style="color: #79c0ff;">&quot;Properties&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="31">              <span style="color: #79c0ff;">&quot;Path&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;/&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="32">              <span style="color: #79c0ff;">&quot;Method&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;ANY&quot;</span>
</div><div class="line" data-line="33">            <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="34">          <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="35">        <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="36">      <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="37">    <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="38">  <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="39">  <span style="color: #79c0ff;">&quot;Outputs&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="40">    <span style="color: #79c0ff;">&quot;ApiURL&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="41">      <span style="color: #79c0ff;">&quot;Description&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;API endpoint URL for Prod environment&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="42">      <span style="color: #79c0ff;">&quot;Value&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="43">        <span style="color: #79c0ff;">&quot;Fn::Sub&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;https://$&lbrace;ServerlessRestApi&rbrace;.execute-api.$&lbrace;AWS::Region&rbrace;.amazonaws.com/Prod/&quot;</span>
</div><div class="line" data-line="44">      <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="45">    <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="46">  <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="47"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
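<p>The two entries under <code>Events</code> act as a catch-all pair. A rough sketch of how a request path maps to them, assuming the usual API Gateway behaviour that the greedy <code>&lbrace;proxy+&rbrace;</code> variable needs at least one path segment (which is why <code>/</code> gets its own entry). <code>which_event</code> is a hypothetical helper for illustration only:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"># Hypothetical helper: name the template event that would receive a request path.
which_event() {
  case "$1" in
    /)  echo RootResource ;;   # the bare root path
    /*) echo ProxyResource ;;  # any path with at least one segment
    *)  echo none ;;           # not an absolute path
  esac
}
</code></pre>
<p>For example, <code>which_event /</code> names the root entry and <code>which_event /hello/world</code> names the proxy entry. In both cases the request ends up in the same Lambda function, and Falco does the actual routing.</p>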
<p>We can try to deploy our setup now.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">cd</span> <span style="color: #e6edf3;">HelloFalco</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">dotnet</span> <span style="color: #e6edf3;">lambda</span> <span style="color: #e6edf3;">deploy-serverless</span>
</div></code></pre>
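<p>The defaults file says that every command line option can be stored in it, so the same values can presumably be passed as flags instead. A minimal sketch, assuming each long option has the same name as its JSON key (check <code>dotnet lambda help</code> before relying on it). <code>deploy_command</code> is a hypothetical helper that only prints the command and does not run it:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"># Hypothetical helper: print the deploy command with the defaults spelled out as flags.
# Assumption: each long option is named after the matching key in the JSON file.
# Arguments: stack name, S3 bucket, S3 prefix.
deploy_command() {
  printf 'dotnet lambda deploy-serverless --stack-name %s --s3-bucket %s --s3-prefix %s --template serverless.template --configuration Release --framework netcoreapp3.1\n' "$1" "$2" "$3"
}
</code></pre>
<p>With my values this would be <code>deploy_command HelloFalco lambda.kodfodrasz.net HelloFalco</code>. Keeping the values in the JSON file is still more convenient, because the plain <code>dotnet lambda deploy-serverless</code> call picks them up.</p>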
<p>If we send a request with cURL to the URL displayed at the end of the successful deployment, we can see that Falco does indeed work in AWS Lambda.</p>
<p><img src="/static/img/blog/advent_article_part2-hello_serverless.jpg" alt="Falco responding successfully from AWS Lambda" /></p>
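<p>The URL printed by the deployment follows the <code>Fn::Sub</code> pattern from the <code>Outputs</code> section of the template. A small sketch that rebuilds it from the REST API id and the region. <code>api_url</code> is a hypothetical helper, and the id and region in the usage line are placeholders:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"># Hypothetical helper: rebuild the ApiURL output from its two variable parts.
# Arguments: REST API id, AWS region.
api_url() {
  printf 'https://%s.execute-api.%s.amazonaws.com/Prod/\n' "$1" "$2"
}
</code></pre>
<p>A quick check could then look like <code>curl -i "$(api_url abc123 eu-central-1)"</code>, with your own API id and region substituted.</p>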
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2>
<p>In the example above we have seen how to adapt a Falco based ASP.Net Core web application for hosting in AWS Lambda. AWS Lambda is not fit for every use case, but if it suits yours, it can be one of the cheapest ways to host a low traffic web application, with minimal maintenance burden and fine grained security.</p>
<p>I have evaluated the maintainability effects of supporting both local/self-hosted and AWS Lambda hosted execution models at the same time, and I came to the conclusion that it has a minimal overhead. This has the potential to improve the testability of our services in the partially-integrated setups in the CI pipeline, with minimal programmer effort. It could also make some types of problems easier to track down by running the app in a local debugger, with confidence that the request/response processing pipeline is highly similar.</p>
<p>Given Falco's simple, readable API and its ease of use, I will definitely use it for new projects where possible, even if we do not migrate every F# based Lambda service we have to it.</p>
<p>If you are interested in trying out the setup outlined without reproducing it step-by-step, then check out the <a href="https://github.com/kodfodrasz/fsharp-advent-calendar-article-2020-12-18/">repository containing the code and the article at GitHub</a>.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Fri, 18 Dec 2020 13:25:21 +0100</pubDate>
    </item>
    <item>
      <title>Diving into Firecracker with Alpine</title>
      <link>https://dev.l1x.be/posts/2020/12/13/diving-into-firecracker-with-alpine/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2020/12/13/diving-into-firecracker-with-alpine/</guid>
      <content:encoded><![CDATA[<h2 id="article-series"><a href="#article-series">Article series</a></h2>
<ul>
<li>1st part :: <a href="https://dev.l1x.be/posts/2020/11/22/getting-started-with-firecracker-on-raspberry-pi/">https://dev.l1x.be/posts/2020/11/22/getting-started-with-firecracker-on-raspberry-pi/</a></li>
<li>2nd part :: this</li>
</ul>
<h2 id="intro"><a href="#intro">Intro</a></h2>
<p>In the 1st article I briefly introduced Firecracker as a lightweight virtualization/containerization solution for extreme scale (like AWS Lambda functions). This time I am going to dig a bit deeper into the API and the management of microVMs. I am going to install Alpine on the RPI, install Rust, Python and Docker, get the Linux kernel source and compile a new kernel with a minimal config, compile our own Firecracker, and then create a new rootfs to be able to boot up a guest. Most of these steps are optional: you can use the stock kernel the Firecracker team provides or download a Firecracker release from GitHub.</p>
<h2 id="setup-alpine"><a href="#setup-alpine">Setup Alpine</a></h2>
<p>If you do not care about Alpine on RPI you can jump to the Firecracker section.</p>
<p>I would like to keep going with <a href="https://amzn.to/2Klb9fx">Raspberry Pi 4B 8GB</a> or <a href="https://amzn.to/2KeohTO">Raspberry Pi 4B 4GB</a> for many reasons. It is a small system that you can easily hack on without changing anything on your desktop. It is also an ARM64 (ARM Cortex-A72) system that has great performance even without active cooling. I usually use it with an <a href="https://amzn.to/3naOS2L">aluminium case</a> that dissipates heat very well and keeps the CPU cool. It has enough CPU power and memory to compile any software including Firecracker, the Linux kernel, and more. Since this is a side project I don't care how long it takes to build a new kernel; it usually finishes within 2 hours (I might get exact timing later).</p>
<p>Another item on my to-do list is to get Alpine Linux as both the host and the guest system. For those who do not know it, Alpine is a small Linux distribution designed for security, simplicity, and resource efficiency. It comes with sane defaults and Musl as its C standard library. Alpine uses its own package management system, apk-tools, providing super-fast package installation. Alpine allows a very small system, with a minimal installation being around 130 MB. The init system is the lightweight OpenRC; Alpine does not use systemd. This was the primary reason I wanted to get into Alpine.</p>
<h3 id="installing-alpine-on-rpi-4"><a href="#installing-alpine-on-rpi-4">Installing Alpine on RPI 4</a></h3>
<p>This is the most complicated part of the setup because RPI has a special boot procedure that uses a FAT partition and the GPU. When installing Alpine, you first need to create a FAT partition at the beginning of the SD card with MBR. I am using macOS this time. I am pretty sure it is easy to translate this to Linux (not sure about Windows).</p>
<h4 id="creating-the-partition"><a href="#creating-the-partition">Creating the partition</a></h4>
<p>My microSD card is /dev/disk6. I create a partition with the name ALP (1024MB), then activate it with fdisk.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">diskutil</span> <span style="color: #e6edf3;">list</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">diskutil</span> <span style="color: #e6edf3;">partitionDisk</span> <span style="color: #e6edf3;">/dev/disk6</span> <span style="color: #e6edf3;">MBR</span> <span style="color: #a5d6ff;">&quot;FAT32&quot;</span> <span style="color: #e6edf3;">ALP</span> <span style="color: #e6edf3;">1024MB</span> <span style="color: #a5d6ff;">&quot;Free Space&quot;</span> <span style="color: #e6edf3;">SYS</span> <span style="color: #e6edf3;">R</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">fdisk</span> <span style="color: #e6edf3;">-e</span> <span style="color: #e6edf3;">/dev/disk6</span>
</div><div class="line" data-line="4"><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">f</span> <span style="color: #79c0ff;">1</span>
</div><div class="line" data-line="5"><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">w</span>
</div><div class="line" data-line="6"><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">exit</span>
</div></code></pre>
<p>After these commands run successfully, macOS mounts the newly created partition at /Volumes/ALP.</p>
<h4 id="downloading-alpine-and-writing-it-to-the-sd-card"><a href="#downloading-alpine-and-writing-it-to-the-sd-card">Downloading Alpine and writing it to the SD card</a></h4>
<p>You can run the download from any directory; just make sure the previously created partition is mounted.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">wget</span> <span style="color: #e6edf3;">http://dl-cdn.alpinelinux.org/alpine/v3.12/releases/aarch64/alpine-rpi-3.12.1-aarch64.tar.gz</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">tar</span> <span style="color: #e6edf3;">xzvf</span> <span style="color: #e6edf3;">alpine-rpi-3.12.1-aarch64.tar.gz</span> <span style="color: #e6edf3;">-C</span> <span style="color: #e6edf3;">/Volumes/ALP/</span>
</div></code></pre>
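<p>It is worth checking the archive against its published checksum before extracting it. A minimal sketch, assuming a <code>.sha256</code> file is offered next to the tarball on the mirror (look at the release directory listing to confirm). <code>verify_sha256</code> is a helper name made up for this article:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"># Hypothetical helper: compare a file's SHA-256 digest with an expected hex string.
# Uses sha256sum where it exists (Linux) and falls back to shasum (macOS).
# Returns 0 when the digests match and non-zero otherwise.
verify_sha256() {
  file=$1
  expected=$2
  if command -v sha256sum >/dev/null; then
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
  else
    actual=$(shasum -a 256 "$file" | cut -d ' ' -f 1)
  fi
  [ "$actual" = "$expected" ]
}
</code></pre>
<p>Usage would be <code>verify_sha256 alpine-rpi-3.12.1-aarch64.tar.gz EXPECTED_HASH</code>, where <code>EXPECTED_HASH</code> stands for the value copied from the checksum file.</p>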
<h4 id="configuring-rpi-boot"><a href="#configuring-rpi-boot">Configuring RPI boot</a></h4>
<p>This part is optional. You can disable audio, wifi, Bluetooth, etc., enable UART, and configure the GPU memory. The full documentation is here:</p>
<p><a href="https://www.raspberrypi.org/documentation/configuration/config-txt/">https://www.raspberrypi.org/documentation/configuration/config-txt/</a></p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">cd</span> <span style="color: #e6edf3;">/Volumes/ALP/</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">echo</span> <span style="color: #a5d6ff;">&#39;dtparam=audio=off&#39;</span>          <span style="color: #79c0ff;">&gt;&gt;</span> <span style="color: #e6edf3;">usercfg.txt</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">echo</span> <span style="color: #a5d6ff;">&#39;dtoverlay=pi3-disable-wifi&#39;</span> <span style="color: #79c0ff;">&gt;&gt;</span> <span style="color: #e6edf3;">usercfg.txt</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">echo</span> <span style="color: #a5d6ff;">&#39;enable_uart=1&#39;</span>              <span style="color: #79c0ff;">&gt;&gt;</span> <span style="color: #e6edf3;">usercfg.txt</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">echo</span> <span style="color: #a5d6ff;">&#39;gpu_mem=64&#39;</span>                 <span style="color: #79c0ff;">&gt;&gt;</span> <span style="color: #e6edf3;">usercfg.txt</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">echo</span> <span style="color: #a5d6ff;">&#39;disable_overscan=1&#39;</span>         <span style="color: #79c0ff;">&gt;&gt;</span> <span style="color: #e6edf3;">usercfg.txt</span>
</div></code></pre>
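<p>Because <code>&gt;&gt;</code> appends, running the commands above a second time leaves duplicate lines in <code>usercfg.txt</code>. A small sketch of an idempotent variant; <code>add_cfg_line</code> is a hypothetical helper, not an Alpine or Raspberry Pi tool:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"># Hypothetical helper: append a setting only if that exact line is not present yet.
# Arguments: the config line, the config file.
add_cfg_line() {
  line=$1
  file=$2
  # Make sure the file exists so grep has something to read.
  touch "$file"
  # -x matches whole lines, -F treats the setting as a fixed string.
  if ! grep -qxF -- "$line" "$file"; then
    echo "$line" >> "$file"
  fi
}
</code></pre>
<p>For example, <code>add_cfg_line 'enable_uart=1' usercfg.txt</code> can be repeated safely.</p>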
<p>You are ready to remove the SD card.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">cd</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">diskutil</span> <span style="color: #e6edf3;">eject</span> <span style="color: #e6edf3;">/dev/disk6</span>
</div></code></pre>
<h4 id="booting-and-configuring-alpine"><a href="#booting-and-configuring-alpine">Booting and configuring Alpine</a></h4>
<p>After inserting the SD card into the RPI you can boot it up. I am using a <a href="https://amzn.to/3452Ziz">converter</a> that adapts the Pi's micro HDMI port to a full-size HDMI connector, which makes it easy to connect a TV or a monitor. I usually connect the device to the network with an ethernet cable and plug in a wired USB keyboard.</p>
<p>Once the device has booted, you can log in as root (no password).</p>
<p>Alpine has a neat tool to configure a new system. It asks a few questions about keyboard layout and timezone, and makes you set a root password.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">setup-alpine</span>
</div></code></pre>
<p>Once setup-alpine is done you need to change a few things around because up to this moment you operated on the FAT partition. After updating the system and adding cfdisk you can create a new partition and use the remaining space on the SD card to have a proper system. In cfdisk, select “Free space” and the option “New”. It suggests using the entire available space, just press enter, then select the option “primary”, followed by “Write”. Type “yes” to write the partition table to disk, then select “Quit”.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">apk</span> <span style="color: #e6edf3;">update</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">apk</span> <span style="color: #e6edf3;">upgrade</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">apk</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">cfdisk</span> <span style="color: #e6edf3;">e2fsprogs</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">cfdisk</span> <span style="color: #e6edf3;">/dev/mmcblk0</span>
</div></code></pre>
<p>Once the new partition is ready, you need to create a filesystem on it and install a basic Alpine system with setup-disk. In &quot;sys&quot; mode it acts as an installer and permanently installs Alpine on the disk. setup-disk might print some errors while it runs; you can ignore them.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">mkfs.ext4</span> <span style="color: #e6edf3;">/dev/mmcblk0p2</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">mount</span> <span style="color: #e6edf3;">/dev/mmcblk0p2</span> <span style="color: #e6edf3;">/mnt</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">setup-disk</span> <span style="color: #e6edf3;">-m</span> <span style="color: #e6edf3;">sys</span> <span style="color: #e6edf3;">/mnt</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">mount</span> <span style="color: #e6edf3;">-o</span> <span style="color: #e6edf3;">remount,rw</span> <span style="color: #e6edf3;">/media/mmcblk0p1</span>
</div></code></pre>
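<p>The following steps are what I found on the Alpine wiki and they work. They move the boot files from the new root filesystem to the FAT partition and replace the boot directory with a symlink. There might be an easier way.</p>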
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">rm</span> <span style="color: #e6edf3;">-f</span> <span style="color: #e6edf3;">/media/mmcblk0p1/boot/*</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">cd</span> <span style="color: #e6edf3;">/mnt</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">rm</span> <span style="color: #e6edf3;">boot/boot</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">mv</span> <span style="color: #e6edf3;">boot/*</span> <span style="color: #e6edf3;">/media/mmcblk0p1/boot/</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">rm</span> <span style="color: #e6edf3;">-Rf</span> <span style="color: #e6edf3;">boot</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">mkdir</span> <span style="color: #e6edf3;">media/mmcblk0p1</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">ln</span> <span style="color: #e6edf3;">-s</span> <span style="color: #e6edf3;">media/mmcblk0p1/boot</span> <span style="color: #e6edf3;">boot</span>
</div></code></pre>
<p>There are only two steps left, adjusting fstab and cmdline.txt.</p>
<p>Fstab:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #79c0ff;">UUID</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">your-uuid</span>    <span style="color: #d2a8ff;">/</span>                 <span style="color: #e6edf3;">ext4</span>  <span style="color: #e6edf3;">rw,relatime</span>  <span style="color: #79c0ff;">0</span> <span style="color: #79c0ff;">0</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">/dev/mmcblk0p1</span>    <span style="color: #e6edf3;">/media/mmcblk0p1</span>  <span style="color: #e6edf3;">vfat</span>  <span style="color: #e6edf3;">rw</span>           <span style="color: #79c0ff;">0</span> <span style="color: #79c0ff;">0</span>
</div></code></pre>
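<p>Add the content above to etc/fstab. Note the missing leading /: the path is relative because the current directory is still /mnt, so this edits the fstab of the new system.</p>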
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">vi</span> <span style="color: #e6edf3;">etc/fstab</span>
</div></code></pre>
<p>Append the following to cmdline.txt:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">root</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">/dev/mmcblk0p2</span>
</div></code></pre>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">vi</span> <span style="color: #e6edf3;">/media/mmcblk0p1/cmdline.txt</span>
</div></code></pre>
<p>It looks like this for me:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">cat</span> <span style="color: #e6edf3;">/media/mmcblk0p1/cmdline.txt</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">modules</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">loop,squashfs,sd-mod,usb-storage</span> <span style="color: #d2a8ff;">quiet</span> <span style="color: #e6edf3;">console=tty1</span> <span style="color: #e6edf3;">root=/dev/mmcblk0p2</span>
</div></code></pre>
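<p>If you script this step instead of editing the file by hand, it helps to make the change idempotent so that a second run does not add root= twice. A minimal sketch, assuming a hypothetical helper called append_root_arg (it is not part of Alpine) that works on the command line as a string:</p>

```shell
# append_root_arg is a hypothetical helper: it prints a kernel command line
# with root=DEVICE appended, unless a root= argument is already present.
append_root_arg() {
  cmdline="$1"
  device="$2"
  case " $cmdline " in
    *" root="*) printf '%s\n' "$cmdline" ;;
    *) printf '%s root=%s\n' "$cmdline" "$device" ;;
  esac
}

# Example: the command line from this post before the change.
append_root_arg "modules=loop,squashfs,sd-mod,usb-storage quiet console=tty1" /dev/mmcblk0p2
```

<p>The helper only prints the new line; writing it back to /media/mmcblk0p1/cmdline.txt is left to you, so you can check the result first.</p>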
<p>I am not sure if lbu commit is necessary here. When Alpine Linux boots in diskless mode, it initially loads only a few required packages from the boot device. Local adjustments in RAM are still possible, e.g. installing a package or changing some configuration, and lbu commit saves those adjustments back to the boot media so that they survive a reboot.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">lbu</span> <span style="color: #e6edf3;">commit</span> <span style="color: #e6edf3;">-d</span>
</div></code></pre>
<p>You can now reboot, log in as root and verify that everything is working. Going forward it is best to have a user other than root.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">adduser</span> <span style="color: #e6edf3;">l1x</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">addgroup</span> <span style="color: #e6edf3;">l1x</span> <span style="color: #e6edf3;">wheel</span>
</div></code></pre>
<p>I usually add the following packages and use sudo from then on. If sudo is not installed yet, run this first command as root without the sudo prefix:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">apk</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">tmux</span> <span style="color: #e6edf3;">fish</span> <span style="color: #e6edf3;">ninja</span> <span style="color: #e6edf3;">clang</span> <span style="color: #e6edf3;">g++</span> <span style="color: #e6edf3;">sudo</span> <span style="color: #e6edf3;">git</span> <span style="color: #e6edf3;">python3</span> <span style="color: #e6edf3;">socat</span> <span style="color: #e6edf3;">curl</span> <span style="color: #e6edf3;">vim</span> <span style="color: #e6edf3;">procps</span>
</div></code></pre>
<p>Make sure that the wheel group can use sudo by adding this line to /etc/sudoers (ideally with visudo):</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">%wheel</span> ALL=<span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">ALL</span><span style="color: #e6edf3;">)</span> <span style="color: #d2a8ff;">NOPASSWD:</span> <span style="color: #e6edf3;">ALL</span>
</div></code></pre>
<p>Changing my shell to fish by editing my entry in /etc/passwd:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">l1x:x:1000:1000:Linux</span> <span style="color: #e6edf3;">User,,,:/home/l1x:/usr/bin/fish</span>
</div></code></pre>
<p>Hopefully, by now you have a working environment. I use a bigger drive for /data where I store all the development folders.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">l1x</span><span style="color: #d2a8ff;">@alpine</span> <span style="color: #e6edf3;">~</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">mount</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">column</span> <span style="color: #e6edf3;">-t</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">egrep</span> <span style="color: #a5d6ff;">&#39;^/dev&#39;</span>
</div><div class="line" data-line="2">/dev/mmcblk0p2  on  /                 type  ext4        <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">rw,relatime</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">/dev/mmcblk0p1</span>  on  /media/mmcblk0p1  type  vfat        <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">rw,relatime,fmask=0022,dmask=0022,codepage=437,...</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">/dev/sda1</span>       on  /data             type  xfs         <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,...</span><span style="color: #e6edf3;">)</span><span style="color: #79c0ff;"></span>
</div></code></pre>
<h2 id="setting-up-firecracker-dev-environment"><a href="#setting-up-firecracker-dev-environment">Setting up Firecracker dev environment</a></h2>
<p>Once you have logged in via SSH to your Alpine system, make sure you have the dev tools you are going to need. I usually use the following tools, and some more:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">apk</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">tmux</span> <span style="color: #e6edf3;">git</span> <span style="color: #e6edf3;">python3</span> <span style="color: #e6edf3;">curl</span> <span style="color: #e6edf3;">vim</span>
</div></code></pre>
<p>I was trying to figure out how to install Rust on ARM64 Linux and the most straightforward way seems to be rustup:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">curl</span> <span style="color: #e6edf3;">--proto</span> <span style="color: #a5d6ff;">&#39;=https&#39;</span> <span style="color: #e6edf3;">--tlsv1.2</span> <span style="color: #e6edf3;">-sSf</span> <span style="color: #e6edf3;">https://sh.rustup.rs</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">sh</span>
</div></code></pre>
<p>If you want to compile Firecracker yourself you also need Docker. Docker for this version of Alpine lives in the community repo. Simply append the community line to your repositories:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">cat</span> <span style="color: #e6edf3;">/etc/apk/repositories</span>
</div><div class="line" data-line="2"><span style="color: #8b949e;">#/media/mmcblk0p1/apks</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">http://your.nearest.mirror/mirrors/pub/alpine/v3.12/main</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">http://your.nearest.mirror/mirrors/pub/alpine/v3.12/community</span>
</div></code></pre>
<p>Installing Docker:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">apk</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">docker</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">addgroup</span> <span style="color: #e6edf3;">$</span><span style="color: #79c0ff;">USER</span> <span style="color: #e6edf3;">docker</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">rc-update</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">boot</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">service</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">start</span>
</div></code></pre>
<p>After Docker is running you can clone the Firecracker repo:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">git</span> <span style="color: #e6edf3;">clone</span> <span style="color: #e6edf3;">git@github.com:firecracker-microvm/firecracker.git</span>
</div></code></pre>
<p>Before you can compile a release you need to install two more packages:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">apk</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">bash</span> <span style="color: #e6edf3;">ncurses</span>
</div></code></pre>
<p>Now you can compile the Firecracker binaries:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">./tools/devtool</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">--release</span> <span style="color: #e6edf3;">--libc</span> <span style="color: #e6edf3;">musl</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">[</span><span style="color: #d2a8ff;">Firecracker</span> <span style="color: #e6edf3;">devtool</span><span style="color: #e6edf3;">]</span> <span style="color: #e6edf3;">About</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">pull</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">image</span> <span style="color: #e6edf3;">fcuvm/dev:v24</span>
</div><div class="line" data-line="3"><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">Firecracker</span> <span style="color: #e6edf3;">devtool</span><span style="color: #e6edf3;">]</span> <span style="color: #e6edf3;">Continue?</span> <span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">y/n</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">y</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">Digest:</span> <span style="color: #e6edf3;">sha256:12b8efe9a91d31349a6241b7d81c26d50bf913e369b5845a921be720e5de5796</span>
</div><div class="line" data-line="5"><span style="color: #e6edf3;">Status:</span> <span style="color: #e6edf3;">Downloaded</span> <span style="color: #e6edf3;">newer</span> <span style="color: #e6edf3;">image</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">fcuvm/dev:v24</span>
</div><div class="line" data-line="6"><span style="color: #e6edf3;">docker.io/fcuvm/dev:v24</span>
</div><div class="line" data-line="7"><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">Firecracker</span> <span style="color: #e6edf3;">devtool</span><span style="color: #e6edf3;">]</span> <span style="color: #e6edf3;">Starting</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">release,</span> <span style="color: #e6edf3;">musl</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">...</span>
</div><div class="line" data-line="8">
</div><div class="line" data-line="9">
</div></code></pre>
<p>The build generates a few binaries:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">ls</span> <span style="color: #e6edf3;">build/cargo_target/aarch64-unknown-linux-musl/release/</span><span style="color: #e6edf3;">&lbrace;</span><span style="color: #e6edf3;">firecracker,jailer</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="2"> <span style="color: #d2a8ff;">build/cargo_target/aarch64-unknown-linux-musl/release/firecracker*</span>
</div><div class="line" data-line="3"> <span style="color: #d2a8ff;">build/cargo_target/aarch64-unknown-linux-musl/release/jailer*</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">file</span> <span style="color: #e6edf3;">build/cargo_target/aarch64-unknown-linux-musl/release/</span><span style="color: #e6edf3;">&lbrace;</span><span style="color: #e6edf3;">firecracker,jailer</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">build/cargo_target/aarch64-unknown-linux-musl/release/firecracker:</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">ELF</span> <span style="color: #e6edf3;">64-bit</span> <span style="color: #e6edf3;">LSB</span> <span style="color: #e6edf3;">executable,</span> <span style="color: #e6edf3;">ARM</span> <span style="color: #e6edf3;">aarch64,</span> <span style="color: #e6edf3;">version</span> <span style="color: #79c0ff;">1</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">GNU/Linux</span><span style="color: #e6edf3;">)</span>,
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">statically</span> <span style="color: #e6edf3;">linked,</span> <span style="color: #e6edf3;">BuildID</span><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">sha1</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">=4da58d970ac0c51aad276309866f2b701cc397cd,</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">debug_info,</span> <span style="color: #e6edf3;">not</span> <span style="color: #e6edf3;">stripped</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">build/cargo_target/aarch64-unknown-linux-musl/release/jailer:</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">ELF</span> <span style="color: #e6edf3;">64-bit</span> <span style="color: #e6edf3;">LSB</span> <span style="color: #e6edf3;">executable,</span> <span style="color: #e6edf3;">ARM</span> <span style="color: #e6edf3;">aarch64,</span> <span style="color: #e6edf3;">version</span> <span style="color: #79c0ff;">1</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">GNU/Linux</span><span style="color: #e6edf3;">)</span>,
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">statically</span> <span style="color: #e6edf3;">linked,</span> <span style="color: #e6edf3;">BuildID</span><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">sha1</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">=3e3c780b4e0fbd74b661c54f11192f9a15b89cba,</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">debug_info,</span> <span style="color: #e6edf3;">not</span> <span style="color: #e6edf3;">stripped</span>
</div></code></pre>
<p>Using these binaries we can create the VMs.</p>
<h2 id="creating-a-microvm"><a href="#creating-a-microvm">Creating a microVM</a></h2>
<p>Before getting started, there are multiple ways to start a microVM with Firecracker. Here are a few:</p>
<ul>
<li>starting the Firecracker binary, configuring it through the Unix socket API and then starting a VM</li>
<li>starting Firecracker with a complete VM config file, without using the Unix socket API</li>
<li>starting Firecracker through Jailer, which uses cgroups to containerize the VM</li>
</ul>
<p>We are going to check out the first way.</p>
<p>When I started to fiddle with Firecracker I tried the official CLI (Firectl). It is written in Go, so you need a Go compiler if you would like to build it yourself. I did not like this option too much, so I created a new CLI called <a href="https://github.com/l1x/pattacu">Pattacu</a>, written in Python.</p>
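<p>Whichever CLI you use, the first approach comes down to a few PUT requests against the API socket: one for the kernel, one for the root drive and one to start the instance. The sketch below is only an illustration of that flow, not how Pattacu is implemented. The helper names, the socket path and the file paths are assumptions, and the JSON is built naively, so paths containing double quotes would break it.</p>

```shell
# Hypothetical helpers that print the JSON bodies for the three API calls.
# The file paths used in the example at the bottom are placeholders.
boot_source_json() {
  printf '{"kernel_image_path": "%s", "boot_args": "%s"}\n' "$1" "$2"
}

rootfs_drive_json() {
  printf '{"drive_id": "rootfs", "path_on_host": "%s", "is_root_device": true, "is_read_only": false}\n' "$1"
}

instance_start_json() {
  printf '{"action_type": "InstanceStart"}\n'
}

# api_put sends one PUT request to a running Firecracker process.
# Arguments: socket path, endpoint (without leading slash), JSON body.
api_put() {
  socket="$1"; endpoint="$2"; body="$3"
  curl --unix-socket "$socket" -X PUT "http://localhost/$endpoint" \
    -H 'Content-Type: application/json' -d "$body"
}

# Print the payloads; nothing is sent until api_put is called.
boot_source_json "./Image" "console=ttyS0 reboot=k panic=1"
rootfs_drive_json "./alpine.ext4"
instance_start_json
```

<p>With Firecracker started as firecracker --api-sock /tmp/firecracker.socket, the three bodies would go to the boot-source, drives/rootfs and actions endpoints, in that order, for example api_put /tmp/firecracker.socket actions "$(instance_start_json)".</p>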
<h3 id="compiling-a-new-kernel"><a href="#compiling-a-new-kernel">Compiling a new kernel</a></h3>
<p>This is optional. You can download the official kernel from Firecracker:</p>
<p><a href="https://github.com/firecracker-microvm/firecracker/blob/master/docs/rootfs-and-kernel-setup.md">https://github.com/firecracker-microvm/firecracker/blob/master/docs/rootfs-and-kernel-setup.md</a></p>
<p>If you decide to compile a new Linux kernel there are a few things you need:</p>
<ul>
<li>kernel-source</li>
<li>tools to compile</li>
</ul>
<p>I usually use one of the long term releases:</p>
<ul>
<li><a href="https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.14.212.tar.xz">https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.14.212.tar.xz</a></li>
<li><a href="https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.19.163.tar.xz">https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.19.163.tar.xz</a></li>
<li><a href="https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.4.83.tar.xz">https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.4.83.tar.xz</a></li>
</ul>
<p>After extracting the kernel source to a folder you can grab the config I have prepared with some help from an OpenWrt developer:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">wget</span> <span style="color: #e6edf3;">https://raw.githubusercontent.com/l1x/pattacu/main/kernel-config/microvm-kernel-arm64.4.19.config</span> <span style="color: #e6edf3;">-O</span> <span style="color: #e6edf3;">.config</span>
</div></code></pre>
<p>There are more tools required for building a new kernel:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">apk</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">bison</span> <span style="color: #e6edf3;">clang</span> <span style="color: #e6edf3;">make</span> <span style="color: #e6edf3;">flex</span> <span style="color: #e6edf3;">linux-headers</span> <span style="color: #e6edf3;">openssl-dev</span> <span style="color: #e6edf3;">perl</span>
</div></code></pre>
<p>With these the kernel can be compiled:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">make</span> <span style="color: #e6edf3;">olddefconfig</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">time</span> <span style="color: #e6edf3;">make</span> <span style="color: #e6edf3;">Image.gz</span>
</div></code></pre>
<p>This is going to take a while. When it finishes, the kernel file we need for the microVM is the uncompressed image at arch/arm64/boot/Image.</p>
<h3 id="creating-a-new-rootfs"><a href="#creating-a-new-rootfs">Creating a new rootfs</a></h3>
<p>This is optional. You can download the official rootfs from Firecracker:</p>
<p><a href="https://github.com/firecracker-microvm/firecracker/blob/master/docs/rootfs-and-kernel-setup.md">https://github.com/firecracker-microvm/firecracker/blob/master/docs/rootfs-and-kernel-setup.md</a></p>
<p>There is a project, alpine-make-rootfs, that can be used to create an Alpine rootfs. With a bit of additional shell scripting, we can create a customized rootfs that can boot up in Firecracker.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">wget</span> <span style="color: #e6edf3;">https://raw.githubusercontent.com/alpinelinux/alpine-make-rootfs/v0.5.1/alpine-make-rootfs</span> <span style="color: #e6edf3;">-O</span> <span style="color: #e6edf3;">alpine-make-rootfs</span> \
</div><div class="line" data-line="2">  <span style="color: #79c0ff;">&amp;&amp;</span> <span style="color: #d2a8ff;">echo</span> <span style="color: #a5d6ff;">&#39;a7159f17b01ad5a06419b83ea3ca9bbe7d3f8c03 alpine-make-rootfs&#39;</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">sha1sum</span> <span style="color: #e6edf3;">-c</span> \
</div><div class="line" data-line="3">  <span style="color: #79c0ff;">||</span> <span style="color: #d2a8ff;">exit</span> <span style="color: #79c0ff;">1</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">chmod</span> <span style="color: #e6edf3;">+x</span> <span style="color: #e6edf3;">alpine-make-rootfs</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">./alpine-make-rootfs</span> \
</div><div class="line" data-line="6">  <span style="color: #e6edf3;">--branch</span> <span style="color: #e6edf3;">v3.12</span> \
</div><div class="line" data-line="7">  <span style="color: #e6edf3;">--packages</span> <span style="color: #a5d6ff;">&#39;openrc util-linux&#39;</span> \
</div><div class="line" data-line="8">  <span style="color: #e6edf3;">--timezone</span> <span style="color: #a5d6ff;">&#39;Europe/Budapest&#39;</span> \
</div><div class="line" data-line="9">  <span style="color: #e6edf3;">--script-chroot</span> \
</div><div class="line" data-line="10">    <span style="color: #e6edf3;">rootfs-</span><span style="color: #e6edf3;">$(</span><span style="color: #d2a8ff;">date</span> <span style="color: #e6edf3;">+%Y%m%d</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.tar.gz</span> - <span style="color: #79c0ff;">&lt;&lt;</span><span style="color: #7ee787;">&#39;SHELL&#39;</span>
</div><div class="line" data-line="11">    <span style="color: #a5d6ff;">ln -s agetty /etc/init.d/agetty.ttyS0</span>
</div><div class="line" data-line="12"><span style="color: #a5d6ff;">    echo ttyS0 &gt; /etc/securetty</span>
</div><div class="line" data-line="13"><span style="color: #a5d6ff;">    echo &#39;nameserver 1.1.1.1&#39; &gt; /etc/resolv.conf</span>
</div><div class="line" data-line="14"><span style="color: #a5d6ff;">    rc-update add agetty.ttyS0 default</span>
</div><div class="line" data-line="15"><span style="color: #a5d6ff;">    rc-update add devfs boot</span>
</div><div class="line" data-line="16"><span style="color: #a5d6ff;">    rc-update add procfs boot</span>
</div><div class="line" data-line="17"><span style="color: #a5d6ff;">    rc-update add sysfs boot</span>
</div><div class="line" data-line="18"><span style="color: #a5d6ff;"></span><span style="color: #7ee787;">SHELL</span>
</div><div class="line" data-line="19">
</div><div class="line" data-line="20"><span style="color: #d2a8ff;">dd</span> <span style="color: #e6edf3;">if=/dev/zero</span> <span style="color: #e6edf3;">of=alpine.ext4</span> <span style="color: #e6edf3;">bs=1</span> <span style="color: #e6edf3;">count=1</span> <span style="color: #e6edf3;">seek=256M</span>
</div><div class="line" data-line="21"><span style="color: #d2a8ff;">mkfs.ext4</span> <span style="color: #e6edf3;">alpine.ext4</span>
</div><div class="line" data-line="22"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">mkdir</span> <span style="color: #e6edf3;">/tmp/alpine-rootfs</span>
</div><div class="line" data-line="23"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">mount</span> <span style="color: #e6edf3;">alpine.ext4</span> <span style="color: #e6edf3;">/tmp/alpine-rootfs</span>
</div><div class="line" data-line="24"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">tar</span> <span style="color: #e6edf3;">xzvf</span> <span style="color: #e6edf3;">rootfs-</span><span style="color: #e6edf3;">$(</span><span style="color: #d2a8ff;">date</span> <span style="color: #e6edf3;">+%Y%m%d</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.tar.gz</span> <span style="color: #e6edf3;">-C</span> <span style="color: #e6edf3;">/tmp/alpine-rootfs</span>
</div><div class="line" data-line="25"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">umount</span> <span style="color: #e6edf3;">/tmp/alpine-rootfs</span>
</div></code></pre>
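<p>One small caveat: the script above calls date twice, so a run that crosses midnight would try to extract an archive under a different name than the one it created. A minimal sketch of one way around that is to compute the name once and reuse it. The helper name rootfs_archive_name is made up for this example.</p>

```shell
# rootfs_archive_name is a hypothetical helper. It prints the archive name
# for a date stamp; without an argument it uses today's date.
rootfs_archive_name() {
  stamp="${1:-$(date +%Y%m%d)}"
  printf 'rootfs-%s.tar.gz\n' "$stamp"
}

# Compute the name once, then use "$archive" for both
# alpine-make-rootfs and the later tar extraction.
archive="$(rootfs_archive_name)"
echo "$archive"
```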
<h3 id="configuring-host-networking"><a href="#configuring-host-networking">Configuring host networking</a></h3>
<p>If you would like to use networking with Firecracker, the host network has to be configured to support it.</p>
<p>First, load the tun kernel module, install iproute2 (which provides the ip command) and create a tap device:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">modprobe</span> <span style="color: #e6edf3;">tun</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">apk</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">iproute2</span> <span style="color: #e6edf3;">acl</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">ip</span> <span style="color: #e6edf3;">tuntap</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">tap0</span> <span style="color: #e6edf3;">mode</span> <span style="color: #e6edf3;">tap</span>
</div></code></pre>
<p>Second, configuring networking and forwarding:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">ip</span> <span style="color: #e6edf3;">addr</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">172.16.0.1/24</span> <span style="color: #e6edf3;">dev</span> <span style="color: #e6edf3;">tap0</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">ip</span> <span style="color: #e6edf3;">link</span> <span style="color: #e6edf3;">set</span> <span style="color: #e6edf3;">tap0</span> <span style="color: #e6edf3;">up</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">sh</span> <span style="color: #e6edf3;">-c</span> <span style="color: #a5d6ff;">&quot;echo 1 &gt; /proc/sys/net/ipv4/ip_forward&quot;</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">iptables</span> <span style="color: #e6edf3;">-t</span> <span style="color: #e6edf3;">nat</span> <span style="color: #e6edf3;">-A</span> <span style="color: #e6edf3;">POSTROUTING</span> <span style="color: #e6edf3;">-o</span> <span style="color: #e6edf3;">eth0</span> <span style="color: #e6edf3;">-j</span> <span style="color: #e6edf3;">MASQUERADE</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">iptables</span> <span style="color: #e6edf3;">-A</span> <span style="color: #e6edf3;">FORWARD</span> <span style="color: #e6edf3;">-m</span> <span style="color: #e6edf3;">conntrack</span> <span style="color: #e6edf3;">--ctstate</span> <span style="color: #e6edf3;">RELATED,ESTABLISHED</span> <span style="color: #e6edf3;">-j</span> <span style="color: #e6edf3;">ACCEPT</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">iptables</span> <span style="color: #e6edf3;">-A</span> <span style="color: #e6edf3;">FORWARD</span> <span style="color: #e6edf3;">-i</span> <span style="color: #e6edf3;">tap0</span> <span style="color: #e6edf3;">-o</span> <span style="color: #e6edf3;">eth0</span> <span style="color: #e6edf3;">-j</span> <span style="color: #e6edf3;">ACCEPT</span>
</div></code></pre>
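<p>If you plan to run more than one microVM, each one needs its own tap device, so it can be handy to parameterize these commands. The sketch below assumes a hypothetical helper called tap_setup_commands that only prints the commands for review. It uses sysctl -w in place of the echo into /proc, which has the same effect.</p>

```shell
# tap_setup_commands is a hypothetical helper. It only prints the commands
# from this section for a given tap device, host address and uplink
# interface, so they can be reviewed before being run as root.
tap_setup_commands() {
  tap="$1"; host_cidr="$2"; uplink="$3"
  printf 'ip tuntap add %s mode tap\n' "$tap"
  printf 'ip addr add %s dev %s\n' "$host_cidr" "$tap"
  printf 'ip link set %s up\n' "$tap"
  printf 'sysctl -w net.ipv4.ip_forward=1\n'
  printf 'iptables -t nat -A POSTROUTING -o %s -j MASQUERADE\n' "$uplink"
  printf 'iptables -A FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT\n'
  printf 'iptables -A FORWARD -i %s -o %s -j ACCEPT\n' "$tap" "$uplink"
}

# Example: the same values as used in this post.
tap_setup_commands tap0 172.16.0.1/24 eth0
```

<p>Once the output looks right it can be piped to sudo sh. Note that the MASQUERADE and conntrack rules are appended again on every run, so for a second tap device you may want to skip those lines.</p>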
<h3 id="enablig-non-root-access"><a href="#enablig-non-root-access">Enabling non-root access</a></h3>
<p>I like to run Firecracker as a non-root user and it is easy to achieve:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">setfacl</span> <span style="color: #e6edf3;">-m</span> <span style="color: #e6edf3;">u:</span><span style="color: #e6edf3;">$</span><span style="color: #79c0ff;">USER</span><span style="color: #e6edf3;">:rw</span> <span style="color: #e6edf3;">/dev/kvm</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">setcap</span> <span style="color: #e6edf3;">cap_net_bind_service=+ep</span> <span style="color: #e6edf3;">/usr/bin/socat</span>
</div></code></pre>
<p>This gives your user read and write access to /dev/kvm and enables socat to bind to port 80 without root, using Linux file capabilities (<code>cap_net_bind_service</code> permits binding to ports below 1024).</p>
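<p>As a quick sanity check after applying the ACL, a short Python sketch can confirm that the current user is able to read and write the KVM device. This is not part of Pattacu; the function name <code>can_use_kvm</code> is made up for illustration.</p>

```python
import os


def can_use_kvm(device: str = "/dev/kvm") -> bool:
    """Return True if the current user may read and write the KVM device node."""
    # Hypothetical helper: os.access asks the kernel whether this user
    # has both read and write permission on the device node.
    return os.path.exists(device) and os.access(device, os.R_OK | os.W_OK)
```

<p>If <code>can_use_kvm()</code> returns <code>False</code>, the <code>setfacl</code> step most likely needs to be repeated, or the KVM module is not loaded.</p>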
<h3 id="booting-up-the-microvm-using-pattacu"><a href="#booting-up-the-microvm-using-pattacu">Booting up the microVM using Pattacu</a></h3>
<p>The only dependency for running Pattacu is Python 3 (I have not tested it with Python 2).</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">apk</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">python3</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">cd</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">python3</span> <span style="color: #e6edf3;">-m</span> <span style="color: #e6edf3;">venv</span> <span style="color: #e6edf3;">venv</span>
</div><div class="line" data-line="4"><span style="color: #8b949e;"># Depending on your shell</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">.</span> <span style="color: #e6edf3;">~/venv/bin/activate.fish</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">cd</span> <span style="color: #e6edf3;">/where/you/store/repos</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">git</span> <span style="color: #e6edf3;">clone</span> <span style="color: #e6edf3;">git@github.com:l1x/pattacu.git</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">cd</span> <span style="color: #e6edf3;">pattacu</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">pip</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">-r</span> <span style="color: #e6edf3;">requirements.txt</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">./bin/pattacu</span> <span style="color: #e6edf3;">-h</span>
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">usage:</span> <span style="color: #e6edf3;">pattacu</span> <span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">-h</span><span style="color: #e6edf3;">]</span> <span style="color: #e6edf3;">&lbrace;</span><span style="color: #e6edf3;">describe-instance,put-boot-source,put-drives,put-machine-config,put-network-interfaces,put-actions</span><span style="color: #e6edf3;">&rbrace;</span> <span style="color: #e6edf3;">...</span>
</div><div class="line" data-line="13">
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">positional</span> <span style="color: #e6edf3;">arguments:</span>
</div><div class="line" data-line="15">  <span style="color: #e6edf3;">&lbrace;</span><span style="color: #a5d6ff;">describe-instance,put-boot-source,put-drives,put-machine-config,put-network-interfaces,put-actions</span><span style="color: #a5d6ff;">&rbrace;</span>
</div><div class="line" data-line="16">
</div><div class="line" data-line="17"><span style="color: #d2a8ff;">optional</span> <span style="color: #e6edf3;">arguments:</span>
</div><div class="line" data-line="18">  <span style="color: #d2a8ff;">-h,</span> <span style="color: #e6edf3;">--help</span>            <span style="color: #e6edf3;">show</span> <span style="color: #e6edf3;">this</span> <span style="color: #e6edf3;">help</span> <span style="color: #e6edf3;">message</span> <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">exit</span>
</div><div class="line" data-line="19"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:35:01</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">Quitting...</span>
</div><div class="line" data-line="20"><span style="color: #e6edf3;"></span>
</div></code></pre>
<p>To start a microVM, a few things need to be set up:</p>
<ul>
<li>starting Firecracker</li>
<li>starting socat</li>
<li>configuring which kernel to boot up as the guest</li>
<li>configuring which rootfs to be used by the guest</li>
<li>configuring guest machine config</li>
<li>configuring guest networking</li>
</ul>
<p>In this order:</p>
<h4 id="starting-firecracker"><a href="#starting-firecracker">Starting Firecracker</a></h4>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">export</span> <span style="color: #e6edf3;">socket_path</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">/data/fc/firecracker.socket</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">rm</span> <span style="color: #e6edf3;">-f</span> <span style="color: #a5d6ff;">&quot;<span style="color: #e6edf3;">$</span><span style="color: #e6edf3;">socket_path</span>&quot;</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">./firecracker</span> <span style="color: #e6edf3;">--api-sock</span> <span style="color: #a5d6ff;">&quot;<span style="color: #e6edf3;">$</span><span style="color: #e6edf3;">socket_path</span>&quot;</span> <span style="color: #e6edf3;">--level</span> <span style="color: #e6edf3;">Debug</span> <span style="color: #e6edf3;">--log-path</span> <span style="color: #e6edf3;">firecracker.log</span> <span style="color: #e6edf3;">--show-log-origin</span> <span style="color: #e6edf3;">--id</span> <span style="color: #e6edf3;">fc-test</span>
</div></code></pre>
<h4 id="starting-socat"><a href="#starting-socat">Starting socat</a></h4>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">socat</span> <span style="color: #e6edf3;">-v</span> <span style="color: #e6edf3;">-v</span> <span style="color: #e6edf3;">TCP-LISTEN:80,reuseaddr,fork</span> <span style="color: #e6edf3;">UNIX-CLIENT:</span><span style="color: #a5d6ff;">&quot;<span style="color: #e6edf3;">$</span><span style="color: #e6edf3;">socket_path</span>&quot;</span>
</div></code></pre>
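<p>socat is only there to expose the Firecracker API socket on TCP port 80. For illustration, the same HTTP API can also be reached directly over the Unix socket. The sketch below uses only the Python standard library; it is an assumption-laden example, not Pattacu's implementation, and the names <code>UnixHTTPConnection</code> and <code>api_put</code> are hypothetical.</p>

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection variant that connects to a Unix domain socket."""

    def __init__(self, socket_path: str, timeout: float = 5.0):
        # "localhost" is only used to fill in the Host header.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        # Replace the default TCP connect with an AF_UNIX connect.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def api_put(socket_path: str, path: str, payload: dict):
    """Send a JSON body with HTTP PUT and return (status, body_text)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        body = json.dumps(payload).encode("utf-8")
        conn.request("PUT", path, body=body,
                     headers={"Content-Type": "application/json"})
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        conn.close()
```

<p>With Firecracker listening on the socket from the previous step, a call such as <code>api_put("/data/fc/firecracker.socket", "/actions", {"action_type": "InstanceStart"})</code> would send the request without going through socat.</p>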
<h4 id="configuring-which-kernel-to-boot-up-as-the-guest"><a href="#configuring-which-kernel-to-boot-up-as-the-guest">Configuring which kernel to boot up as the guest</a></h4>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">./bin/pattacu</span> <span style="color: #e6edf3;">put-boot-source</span> \
</div><div class="line" data-line="2">	<span style="color: #e6edf3;">--boot-args</span> <span style="color: #a5d6ff;">&quot;keep_bootcon console=ttyS0 reboot=k panic=1 pci=off ip=172.16.0.42::172.16.0.1:255.255.255.0::eth0:off&quot;</span> \
</div><div class="line" data-line="3">	<span style="color: #e6edf3;">--kernel-image-path</span> <span style="color: #e6edf3;">/linux/arm64/kernel/4.14.210.image</span>
</div><div class="line" data-line="4">
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:44:12</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">ARGS:</span> <span style="color: #e6edf3;">Namespace</span><span style="color: #e6edf3;"></span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">boot_args</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;keep_bootcon console=ttyS0</span>
</div><div class="line" data-line="6"><span style="color: #a5d6ff;">reboot=k panic=1 pci=off ip=172.16.0.42::172.16.0.1:255.255.255.0::eth0:off&#39;</span><span style="color: #a5d6ff;">,</span> <span style="color: #e6edf3;">func</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;put-boot-source&#39;</span><span style="color: #a5d6ff;">,</span>
</div><div class="line" data-line="7"><span style="color: #e6edf3;">initrd_path</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">None,</span> <span style="color: #e6edf3;">kernel_image_path</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;/linux/arm64/kernel/4.14.210.image&#39;</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:44:12</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">&lbrace;</span><span style="color: #a5d6ff;">&quot;boot_args&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;keep_bootcon console=ttyS0 reboot=k</span>
</div><div class="line" data-line="9"><span style="color: #a5d6ff;">panic=1 pci=off ip=172.16.0.42::172.16.0.1:255.255.255.0::eth0:off&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="10"><span style="color: #a5d6ff;">&quot;kernel_image_path&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;/linux/arm64/kernel/4.14.210.image&quot;</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="11"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:44:12</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">Status:</span> <span style="color: #79c0ff;">204</span> <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">Reason:</span>  <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">body:</span> <span style="color: #a5d6ff;">&quot;&quot;</span>
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:44:12</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">Quitting...</span>
</div></code></pre>
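<p>The <code>ip=</code> part of the boot arguments follows the kernel's colon-separated layout: client IP, server IP, gateway IP, netmask, hostname, device, autoconf. Empty positions are simply left blank. The sketch below splits the value used above into named fields; <code>parse_ip_boot_arg</code> and the field names are hypothetical and only meant to make the string easier to read.</p>

```python
IP_FIELDS = ("client_ip", "server_ip", "gateway_ip", "netmask",
             "hostname", "device", "autoconf")


def parse_ip_boot_arg(boot_args: str) -> dict:
    """Split the ip= token of a kernel command line into named fields."""
    for token in boot_args.split():
        if token.startswith("ip="):
            values = token[len("ip="):].split(":")
            # Pad missing trailing fields with empty strings.
            values += [""] * (len(IP_FIELDS) - len(values))
            return dict(zip(IP_FIELDS, values))
    raise ValueError("no ip= argument found in boot arguments")


args = ("keep_bootcon console=ttyS0 reboot=k panic=1 pci=off "
        "ip=172.16.0.42::172.16.0.1:255.255.255.0::eth0:off")
fields = parse_ip_boot_arg(args)
print(fields["client_ip"], fields["gateway_ip"], fields["device"])
# prints: 172.16.0.42 172.16.0.1 eth0
```

<p>Here the guest gets the static address 172.16.0.42, uses 172.16.0.1 as its gateway, and autoconfiguration is switched off.</p>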
<h4 id="configuring-which-rootfs-to-be-used-by-the-guest"><a href="#configuring-which-rootfs-to-be-used-by-the-guest">Configuring which rootfs to be used by the guest</a></h4>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">./bin/pattacu</span> <span style="color: #e6edf3;">put-drives</span> \
</div><div class="line" data-line="2">  <span style="color: #e6edf3;">--drive-id</span> <span style="color: #e6edf3;">rootfs</span> \
</div><div class="line" data-line="3">  <span style="color: #e6edf3;">--path</span> <span style="color: #e6edf3;">/data/pattacu/rootfs/example-20201213.tar.gz</span> \
</div><div class="line" data-line="4">  <span style="color: #e6edf3;">--read-only</span> <span style="color: #e6edf3;">false</span> \
</div><div class="line" data-line="5">  <span style="color: #e6edf3;">--root-device</span> <span style="color: #e6edf3;">true</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:46:30</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">ARGS:</span> <span style="color: #e6edf3;">Namespace</span><span style="color: #e6edf3;"></span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">drive_id</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;rootfs&#39;</span><span style="color: #a5d6ff;">,</span>
</div><div class="line" data-line="7"><span style="color: #e6edf3;">func</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;put-drives&#39;</span><span style="color: #a5d6ff;">,</span> <span style="color: #e6edf3;">path</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;/data/pattacu/rootfs/example-20201213.tar.gz&#39;</span><span style="color: #a5d6ff;">,</span> <span style="color: #e6edf3;">read_only</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">False,</span> <span style="color: #e6edf3;">root_device</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">True</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:46:30</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">&lbrace;</span><span style="color: #a5d6ff;">&quot;drive_id&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;rootfs&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;path_on_host&quot;</span><span style="color: #e6edf3;">:</span>
</div><div class="line" data-line="9"><span style="color: #a5d6ff;">&quot;/data/pattacu/rootfs/example-20201213.tar.gz&quot;</span><span style="color: #a5d6ff;">,</span> <span style="color: #a5d6ff;">&quot;is_root_device&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">true,</span> <span style="color: #a5d6ff;">&quot;is_read_only&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">false</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:46:30</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">Status:</span> <span style="color: #79c0ff;">204</span> <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">Reason:</span>  <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">body:</span> <span style="color: #a5d6ff;">&quot;&quot;</span>
</div><div class="line" data-line="11"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:46:30</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">Quitting...</span>
</div></code></pre>
<h4 id="configuring-guest-machine-config"><a href="#configuring-guest-machine-config">Configuring guest machine config</a></h4>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">./bin/pattacu</span> <span style="color: #e6edf3;">put-machine-config</span> <span style="color: #e6edf3;">--mem-size-mib</span> <span style="color: #79c0ff;">128</span> <span style="color: #e6edf3;">--vcpu-count</span> <span style="color: #79c0ff;">2</span> <span style="color: #e6edf3;">--ht-enabled</span> <span style="color: #e6edf3;">false</span>
</div><div class="line" data-line="2">
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:47:58</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">ARGS:</span> <span style="color: #e6edf3;">Namespace</span><span style="color: #e6edf3;"></span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">cpu_template</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">None,</span> <span style="color: #e6edf3;">func</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;put-machine-config&#39;</span><span style="color: #a5d6ff;">,</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">ht_enabled</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">False,</span> <span style="color: #e6edf3;">mem_size_mib</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">128,</span> <span style="color: #e6edf3;">track_dirty_pages</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">None,</span> <span style="color: #e6edf3;">vcpu_count</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">2</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:47:58</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">&lbrace;</span><span style="color: #a5d6ff;">&quot;vcpu_count&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">2,</span> <span style="color: #a5d6ff;">&quot;mem_size_mib&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">128,</span> <span style="color: #a5d6ff;">&quot;ht_enabled&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">false</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:47:58</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">Status:</span> <span style="color: #79c0ff;">204</span> <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">Reason:</span>  <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">body:</span> <span style="color: #a5d6ff;">&quot;&quot;</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:47:58</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">Quitting...</span>
</div></code></pre>
<h4 id="configuring-guest-networking"><a href="#configuring-guest-networking">Configuring guest networking</a></h4>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">./bin/pattacu</span> <span style="color: #e6edf3;">put-network-interfaces</span> <span style="color: #e6edf3;">--iface-id</span> <span style="color: #e6edf3;">eth0</span> <span style="color: #e6edf3;">--guest-mac</span> <span style="color: #a5d6ff;">&quot;AA:FC:00:00:00:01&quot;</span> <span style="color: #e6edf3;">--host-dev-name</span> <span style="color: #e6edf3;">tap0</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:48:57</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">ARGS:</span> <span style="color: #e6edf3;">Namespace</span><span style="color: #e6edf3;"></span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">func</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;put-network-interfaces&#39;</span><span style="color: #a5d6ff;">,</span>
</div><div class="line" data-line="3"><span style="color: #e6edf3;">guest_mac</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;AA:FC:00:00:00:01&#39;</span><span style="color: #a5d6ff;">,</span> <span style="color: #e6edf3;">host_dev_name</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;tap0&#39;</span><span style="color: #a5d6ff;">,</span> <span style="color: #e6edf3;">iface_id</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;eth0&#39;</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:48:57</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">&lbrace;</span><span style="color: #a5d6ff;">&quot;iface_id&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;eth0&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="5"><span style="color: #a5d6ff;">&quot;guest_mac&quot;</span><span style="color: #a5d6ff;">:</span> <span style="color: #a5d6ff;">&quot;AA:FC:00:00:00:01&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;host_dev_name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;tap0&quot;</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:48:58</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">Status:</span> <span style="color: #79c0ff;">204</span> <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">Reason:</span>  <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">body:</span> <span style="color: #a5d6ff;">&quot;&quot;</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:48:58</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">Quitting...</span>
</div></code></pre>
<h4 id="starting-up-the-instance"><a href="#starting-up-the-instance">Starting up the instance</a></h4>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">./bin/pattacu</span> <span style="color: #e6edf3;">put-actions</span> <span style="color: #e6edf3;">--action-type</span> <span style="color: #e6edf3;">InstanceStart</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:49:19</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">ARGS:</span> <span style="color: #e6edf3;">Namespace</span><span style="color: #e6edf3;"></span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">action_type</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;InstanceStart&#39;</span><span style="color: #a5d6ff;">,</span> <span style="color: #e6edf3;">func</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;put-actions&#39;</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:49:19</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">&lbrace;</span><span style="color: #a5d6ff;">&quot;action_type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;InstanceStart&quot;</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:49:20</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">Status:</span> <span style="color: #79c0ff;">204</span> <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">Reason:</span>  <span style="color: #e6edf3;">HTTP</span> <span style="color: #e6edf3;">body:</span> <span style="color: #a5d6ff;">&quot;&quot;</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">2020-12-13</span> <span style="color: #e6edf3;">20:49:20</span> <span style="color: #e6edf3;">INFO</span> <span style="color: #e6edf3;">Quitting...</span>
</div></code></pre>
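<p>The whole sequence can be summarised as an ordered list of API calls. The sketch below pairs each Firecracker REST path with the JSON payload that appears in the log output above. The mapping from Pattacu subcommands to these paths is my assumption based on the matching names, and <code>build_boot_plan</code> is a hypothetical helper, not part of Pattacu.</p>

```python
import json


def build_boot_plan() -> list:
    """Return (path, payload) pairs in the order used in this walkthrough."""
    return [
        ("/boot-source", {
            "kernel_image_path": "/linux/arm64/kernel/4.14.210.image",
            "boot_args": ("keep_bootcon console=ttyS0 reboot=k panic=1 pci=off "
                          "ip=172.16.0.42::172.16.0.1:255.255.255.0::eth0:off"),
        }),
        ("/drives/rootfs", {
            "drive_id": "rootfs",
            "path_on_host": "/data/pattacu/rootfs/example-20201213.tar.gz",
            "is_root_device": True,
            "is_read_only": False,
        }),
        # Field names mirror the 2020 log output above; other Firecracker
        # releases may expect different names.
        ("/machine-config", {
            "vcpu_count": 2,
            "mem_size_mib": 128,
            "ht_enabled": False,
        }),
        ("/network-interfaces/eth0", {
            "iface_id": "eth0",
            "guest_mac": "AA:FC:00:00:00:01",
            "host_dev_name": "tap0",
        }),
        # InstanceStart has to come last: it boots the configured guest.
        ("/actions", {"action_type": "InstanceStart"}),
    ]


for path, payload in build_boot_plan():
    print("PUT", path, json.dumps(payload))
```

<p>Each pair would be sent as an HTTP PUT request, and a 204 status, as in the logs above, means the setting was accepted.</p>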
<p>You can switch to the other tmux window and see the system booting up.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">[</span>    1.739745<span style="color: #e6edf3;">]</span> <span style="color: #d2a8ff;">random:</span> <span style="color: #e6edf3;">fast</span> <span style="color: #e6edf3;">init</span> <span style="color: #e6edf3;">done</span> <span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">27/1844</span><span style="color: #e6edf3;">]</span> <span style="color: #e6edf3;">[</span> <span style="color: #e6edf3;">ok</span> <span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="2"> <span style="color: #d2a8ff;">*</span> <span style="color: #e6edf3;">Mounting</span> <span style="color: #e6edf3;">/sys</span> <span style="color: #e6edf3;">...</span> <span style="color: #e6edf3;">[</span> <span style="color: #e6edf3;">ok</span> <span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="3"> <span style="color: #d2a8ff;">*</span> <span style="color: #e6edf3;">Mounting</span> <span style="color: #e6edf3;">security</span> <span style="color: #e6edf3;">filesystem</span> <span style="color: #e6edf3;">...</span> <span style="color: #e6edf3;">[</span> <span style="color: #e6edf3;">ok</span> <span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="4"> <span style="color: #d2a8ff;">*</span> <span style="color: #e6edf3;">Mounting</span> <span style="color: #e6edf3;">debug</span> <span style="color: #e6edf3;">filesystem</span> <span style="color: #e6edf3;">...</span> <span style="color: #e6edf3;">[</span> <span style="color: #e6edf3;">ok</span> <span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="5"> <span style="color: #d2a8ff;">*</span> <span style="color: #e6edf3;">Mounting</span> <span style="color: #e6edf3;">SELinux</span> <span style="color: #e6edf3;">filesystem</span> <span style="color: #e6edf3;">...</span> <span style="color: #e6edf3;">[</span> <span style="color: #e6edf3;">ok</span> <span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="6"> <span style="color: #d2a8ff;">*</span> <span style="color: #e6edf3;">Mounting</span> <span style="color: #e6edf3;">persistent</span> <span style="color: #e6edf3;">storage</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">pstore</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;"></span> <span style="color: #d2a8ff;">filesystem</span> <span style="color: #e6edf3;">...</span> <span style="color: #e6edf3;">[</span> <span style="color: #e6edf3;">ok</span> <span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="7">
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">Welcome</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">Alpine</span> <span style="color: #e6edf3;">Linux</span> <span style="color: #e6edf3;">3.12</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">Kernel</span> <span style="color: #e6edf3;">4.20.0</span> <span style="color: #e6edf3;">on</span> <span style="color: #e6edf3;">an</span> <span style="color: #e6edf3;">aarch64</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">ttyS0</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="10">
</div><div class="line" data-line="11"><span style="color: #79c0ff;">172</span> <span style="color: #e6edf3;">login:</span> <span style="color: #e6edf3;">root</span>
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">Welcome</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">Alpine!</span>
</div><div class="line" data-line="13">
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">The</span> <span style="color: #e6edf3;">Alpine</span> <span style="color: #e6edf3;">Wiki</span> <span style="color: #e6edf3;">contains</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">large</span> <span style="color: #e6edf3;">number</span> <span style="color: #e6edf3;">of</span> <span style="color: #e6edf3;">how-to</span> <span style="color: #e6edf3;">guides</span> <span style="color: #e6edf3;">and</span> <span style="color: #e6edf3;">general</span>
</div><div class="line" data-line="15"><span style="color: #d2a8ff;">information</span> <span style="color: #e6edf3;">about</span> <span style="color: #e6edf3;">administrating</span> <span style="color: #e6edf3;">Alpine</span> <span style="color: #e6edf3;">systems.</span>
</div><div class="line" data-line="16"><span style="color: #d2a8ff;">See</span> <span style="color: #79c0ff;">&lt;</span><span style="color: #e6edf3;">http://wiki.alpinelinux.org/</span><span style="color: #79c0ff;">&gt;</span><span style="color: #e6edf3;">.</span>
</div><div class="line" data-line="17">
</div><div class="line" data-line="18"><span style="color: #d2a8ff;">You</span> <span style="color: #e6edf3;">can</span> <span style="color: #e6edf3;">set</span> <span style="color: #e6edf3;">up</span> <span style="color: #e6edf3;">the</span> <span style="color: #e6edf3;">system</span> <span style="color: #e6edf3;">with</span> <span style="color: #e6edf3;">the</span> <span style="color: #e6edf3;">command:</span> <span style="color: #e6edf3;">setup-alpine</span>
</div><div class="line" data-line="19">
</div><div class="line" data-line="20"><span style="color: #d2a8ff;">You</span> <span style="color: #e6edf3;">may</span> <span style="color: #e6edf3;">change</span> <span style="color: #e6edf3;">this</span> <span style="color: #e6edf3;">message</span> <span style="color: #e6edf3;">by</span> <span style="color: #e6edf3;">editing</span> <span style="color: #e6edf3;">/etc/motd.</span>
</div><div class="line" data-line="21">
</div><div class="line" data-line="22"><span style="color: #e6edf3;">login</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">840</span><span style="color: #e6edf3;">]</span>: <span style="color: #d2a8ff;">root</span> <span style="color: #e6edf3;">login</span> <span style="color: #e6edf3;">on</span> <span style="color: #a5d6ff;">&#39;ttyS0&#39;</span>
</div><div class="line" data-line="23"><span style="color: #d2a8ff;">172:~#</span> <span style="color: #e6edf3;">ping</span> <span style="color: #e6edf3;">hackernews.org</span>
</div><div class="line" data-line="24"><span style="color: #d2a8ff;">PING</span> <span style="color: #e6edf3;">hackernews.org</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">162.255.119.249</span><span style="color: #e6edf3;">)</span>: <span style="color: #79c0ff;">56</span> <span style="color: #e6edf3;">data</span> <span style="color: #e6edf3;">bytes</span>
</div><div class="line" data-line="25"><span style="color: #79c0ff;">64</span> <span style="color: #e6edf3;">bytes</span> <span style="color: #e6edf3;">from</span> <span style="color: #e6edf3;">162.255.119.249:</span> <span style="color: #e6edf3;">seq=0</span> <span style="color: #e6edf3;">ttl=42</span> <span style="color: #e6edf3;">time=180.868</span> <span style="color: #e6edf3;">ms</span>
</div></code></pre>
<h2 id="closing"><a href="#closing">Closing</a></h2>
<p>I think Firecracker has great potential to become the next platform for containerization, especially because of its lean nature. If we could build a reasonable service that hosts Firecracker images that are easy to deploy, it could readily replace Docker. I hope it takes off.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">172:~#</span> <span style="color: #e6edf3;">poweroff</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">The</span> <span style="color: #e6edf3;">system</span> <span style="color: #e6edf3;">is</span> <span style="color: #e6edf3;">going</span> <span style="color: #e6edf3;">down</span> <span style="color: #e6edf3;">NOW!</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">Sent</span> <span style="color: #e6edf3;">SIGTERM</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">all</span> <span style="color: #e6edf3;">processes</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">Sent</span> <span style="color: #e6edf3;">SIGKILL</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">all</span> <span style="color: #e6edf3;">processes</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">Requesting</span> <span style="color: #e6edf3;">system</span> <span style="color: #e6edf3;">poweroff</span>
</div><div class="line" data-line="6"><span style="color: #e6edf3;">[</span>  202.707132<span style="color: #e6edf3;">]</span> <span style="color: #d2a8ff;">reboot:</span> <span style="color: #e6edf3;">Power</span> <span style="color: #e6edf3;">down</span>
</div><div class="line" data-line="7"><span style="color: #e6edf3;">[</span>  202.707132<span style="color: #e6edf3;">]</span><span style="color: #e6edf3;"></span> <span style="color: #d2a8ff;">reboot:</span> <span style="color: #e6edf3;">Power</span> <span style="color: #e6edf3;">down</span>
</div></code></pre>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sun, 13 Dec 2020 23:02:21 +0100</pubDate>
    </item>
    <item>
      <title>Pattacu</title>
      <link>https://dev.l1x.be/projects/pattacu/</link>
      <guid isPermaLink="true">https://dev.l1x.be/projects/pattacu/</guid>
      <content:encoded><![CDATA[<h1 id="pattacu"><a href="#pattacu">Pattacu</a></h1>
<p>CLI for AWS Firecracker</p>
<p><a href="https://tamildictionary.org/tamil_english.php?id=13396">பட்டாசு</a></p>
<h2 id="why"><a href="#why">Why</a></h2>
<p>I wanted to learn more about Firecracker, and I could not compile Firectl on ARM easily, so I started to write a CLI in Python that is much easier to use on any CPU architecture.</p>
<h2 id="how"><a href="#how">How</a></h2>
<p>Most of the functionality is implemented as simple Python functions. It is a single file with few imports. The HTTP calls are implemented using http.client and urllib.parse.</p>
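<p>As a rough illustration of the kind of request one of these functions sends, here is the boot source call sketched with curl rather than Python. It assumes the socket path used in the workflow section of this page; the boot_source_body and put_boot_source names are made up for this sketch and are not part of Pattacu. The /boot-source endpoint and its kernel_image_path and boot_args fields come from the Firecracker API.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"># Hypothetical helper: build the JSON body for PUT /boot-source.
</div><div class="line" data-line="2"># No JSON escaping is done, so this only suits simple values.
</div><div class="line" data-line="3">boot_source_body() {
</div><div class="line" data-line="4">  printf '{"kernel_image_path": "%s", "boot_args": "%s"}\n' "$1" "$2"
</div><div class="line" data-line="5">}
</div><div class="line" data-line="6">
</div><div class="line" data-line="7"># Hypothetical helper: send the body over the API socket.
</div><div class="line" data-line="8"># Needs a running Firecracker process listening on that socket.
</div><div class="line" data-line="9">put_boot_source() {
</div><div class="line" data-line="10">  curl --unix-socket /data/fc/firecracker.socket -X PUT \
</div><div class="line" data-line="11">    -H 'Content-Type: application/json' \
</div><div class="line" data-line="12">    -d "$(boot_source_body "$1" "$2")" \
</div><div class="line" data-line="13">    http://localhost/boot-source
</div><div class="line" data-line="14">}
</div></code></pre>
<p>With Firecracker running, calling put_boot_source with a kernel image path and a boot argument string should do roughly what the put-boot-source subcommand does.</p>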
<h2 id="status"><a href="#status">Status</a></h2>
<p>This is the current status in alphabetical order. All API methods required to start up a guest instance with networking are implemented.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">[ ] createSnapshot
</div><div class="line" data-line="2">[x] createSyncAction
</div><div class="line" data-line="3">[ ] describeBalloonConfig
</div><div class="line" data-line="4">[ ] describeBalloonStats
</div><div class="line" data-line="5">[x] describeInstance
</div><div class="line" data-line="6">[ ] getMachineConfiguration
</div><div class="line" data-line="7">[ ] loadSnapshot
</div><div class="line" data-line="8">[ ] mmdsConfigPut
</div><div class="line" data-line="9">[ ] mmdsGet
</div><div class="line" data-line="10">[ ] mmdsPatch
</div><div class="line" data-line="11">[ ] mmdsPut
</div><div class="line" data-line="12">[ ] patchBalloon
</div><div class="line" data-line="13">[ ] patchBalloonStatsInterval
</div><div class="line" data-line="14">[ ] patchGuestDriveByID
</div><div class="line" data-line="15">[ ] patchGuestNetworkInterfaceByID
</div><div class="line" data-line="16">[x] patchMachineConfiguration
</div><div class="line" data-line="17">[ ] patchVm
</div><div class="line" data-line="18">[ ] putBalloon
</div><div class="line" data-line="19">[x] putGuestBootSource
</div><div class="line" data-line="20">[x] putGuestDriveByID
</div><div class="line" data-line="21">[x] putGuestNetworkInterfaceByID
</div><div class="line" data-line="22">[ ] putGuestVsock
</div><div class="line" data-line="23">[ ] putLogger
</div><div class="line" data-line="24">[ ] putMachineConfiguration
</div><div class="line" data-line="25">[ ] putMetrics
</div></code></pre>
<h2 id="the-complete-workflow"><a href="#the-complete-workflow">The complete workflow</a></h2>
<p>This is the complete workflow:</p>
<h3 id="giving-access"><a href="#giving-access">Giving access</a></h3>
<p>We need access to /dev/kvm as our user and must be able to bind to port 80 as well (using socat). Socat is required because Firecracker uses a Unix socket and I wanted to be able to use a normal network socket instead.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">setfacl</span> <span style="color: #e6edf3;">-m</span> <span style="color: #e6edf3;">u:</span><span style="color: #e6edf3;">$</span><span style="color: #79c0ff;">USER</span><span style="color: #e6edf3;">:rw</span> <span style="color: #e6edf3;">/dev/kvm</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">setcap</span> <span style="color: #e6edf3;">cap_net_bind_service=+ep</span> <span style="color: #e6edf3;">/usr/bin/socat</span>
</div></code></pre>
<h3 id="starting-up-socat-and-firecracker"><a href="#starting-up-socat-and-firecracker">Starting up socat and Firecracker</a></h3>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">socat</span> <span style="color: #e6edf3;">-v</span> <span style="color: #e6edf3;">-v</span> <span style="color: #e6edf3;">TCP-LISTEN:80,reuseaddr,fork</span> <span style="color: #e6edf3;">UNIX-CLIENT:/data/fc/firecracker.socket</span> <span style="color: #e6edf3;">&amp;</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">rm</span> <span style="color: #e6edf3;">-f</span> <span style="color: #e6edf3;">/data/fc/firecracker.socket</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">./firecracker</span> <span style="color: #e6edf3;">--api-sock</span> <span style="color: #e6edf3;">/data/fc/firecracker.socket</span> <span style="color: #e6edf3;">--level</span> <span style="color: #e6edf3;">Debug</span> <span style="color: #e6edf3;">--log-path</span> <span style="color: #e6edf3;">/data/fc/firecracker.log</span> <span style="color: #e6edf3;">--show-log-origin</span> <span style="color: #e6edf3;">--id</span> <span style="color: #e6edf3;">fc-test</span>
</div></code></pre>
<h3 id="configuring-host-networking"><a href="#configuring-host-networking">Configuring host networking</a></h3>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">modprobe</span> <span style="color: #e6edf3;">tun</span>
</div><div class="line" data-line="2"><span style="color: #8b949e;"># sudo apk add iproute2 or similar for the ip command</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">ip</span> <span style="color: #e6edf3;">tuntap</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">tap0</span> <span style="color: #e6edf3;">mode</span> <span style="color: #e6edf3;">tap</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">ip</span> <span style="color: #e6edf3;">addr</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">172.16.0.1/24</span> <span style="color: #e6edf3;">dev</span> <span style="color: #e6edf3;">tap0</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">ip</span> <span style="color: #e6edf3;">link</span> <span style="color: #e6edf3;">set</span> <span style="color: #e6edf3;">tap0</span> <span style="color: #e6edf3;">up</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">sh</span> <span style="color: #e6edf3;">-c</span> <span style="color: #a5d6ff;">&quot;echo 1 &gt; /proc/sys/net/ipv4/ip_forward&quot;</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">iptables</span> <span style="color: #e6edf3;">-t</span> <span style="color: #e6edf3;">nat</span> <span style="color: #e6edf3;">-A</span> <span style="color: #e6edf3;">POSTROUTING</span> <span style="color: #e6edf3;">-o</span> <span style="color: #e6edf3;">eth0</span> <span style="color: #e6edf3;">-j</span> <span style="color: #e6edf3;">MASQUERADE</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">iptables</span> <span style="color: #e6edf3;">-A</span> <span style="color: #e6edf3;">FORWARD</span> <span style="color: #e6edf3;">-m</span> <span style="color: #e6edf3;">conntrack</span> <span style="color: #e6edf3;">--ctstate</span> <span style="color: #e6edf3;">RELATED,ESTABLISHED</span> <span style="color: #e6edf3;">-j</span> <span style="color: #e6edf3;">ACCEPT</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">iptables</span> <span style="color: #e6edf3;">-A</span> <span style="color: #e6edf3;">FORWARD</span> <span style="color: #e6edf3;">-i</span> <span style="color: #e6edf3;">tap0</span> <span style="color: #e6edf3;">-o</span> <span style="color: #e6edf3;">eth0</span> <span style="color: #e6edf3;">-j</span> <span style="color: #e6edf3;">ACCEPT</span>
</div></code></pre>
<h3 id="starting-the-guest-vm"><a href="#starting-the-guest-vm">Starting the guest VM</a></h3>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">./bin/pattacu</span> <span style="color: #e6edf3;">put-boot-source</span> \
</div><div class="line" data-line="2">	<span style="color: #e6edf3;">--boot-args</span> <span style="color: #a5d6ff;">&quot;keep_bootcon console=ttyS0 reboot=k panic=1 pci=off ip=172.16.0.42::172.16.0.1:255.255.255.0::eth0:off&quot;</span> \
</div><div class="line" data-line="3">	<span style="color: #e6edf3;">--kernel-image-path</span> <span style="color: #e6edf3;">/data/fc/arm64/kernel/4.14.210.image</span>
</div><div class="line" data-line="4">
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">./bin/pattacu</span> <span style="color: #e6edf3;">put-drives</span> \
</div><div class="line" data-line="6">	<span style="color: #e6edf3;">--drive-id</span> <span style="color: #e6edf3;">rootfs</span> \
</div><div class="line" data-line="7">	<span style="color: #e6edf3;">--path</span> <span style="color: #e6edf3;">/data/fc/arm64//rootfs/alpine.rootfs.ext4</span> \
</div><div class="line" data-line="8">	<span style="color: #e6edf3;">--read-only</span> <span style="color: #e6edf3;">false</span> \
</div><div class="line" data-line="9">	<span style="color: #e6edf3;">--root-device</span> <span style="color: #e6edf3;">true</span>
</div><div class="line" data-line="10">
</div><div class="line" data-line="11"><span style="color: #d2a8ff;">./bin/pattacu</span> <span style="color: #e6edf3;">put-machine-config</span> <span style="color: #e6edf3;">--mem-size-mib</span> <span style="color: #79c0ff;">128</span> <span style="color: #e6edf3;">--vcpu-count</span> <span style="color: #79c0ff;">2</span> <span style="color: #e6edf3;">--ht-enabled</span> <span style="color: #e6edf3;">false</span>
</div><div class="line" data-line="12">
</div><div class="line" data-line="13"><span style="color: #d2a8ff;">./bin/pattacu</span> <span style="color: #e6edf3;">put-network-interfaces</span> <span style="color: #e6edf3;">--iface-id</span> <span style="color: #e6edf3;">eth0</span> <span style="color: #e6edf3;">--guest-mac</span> <span style="color: #a5d6ff;">&quot;AA:FC:00:00:00:01&quot;</span> <span style="color: #e6edf3;">--host-dev-name</span> <span style="color: #e6edf3;">tap0</span>
</div><div class="line" data-line="14">
</div><div class="line" data-line="15"><span style="color: #d2a8ff;">./bin/pattacu</span> <span style="color: #e6edf3;">put-actions</span> <span style="color: #e6edf3;">--action-type</span> <span style="color: #e6edf3;">InstanceStart</span>
</div></code></pre>
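<p>The ip= part of the boot arguments follows the kernel's static network configuration format, client-ip:server-ip:gw-ip:netmask:hostname:device:autoconf, where unused fields are simply left empty. A small sketch that assembles it from its parts, in case you want a different address per guest (the kernel_ip_arg name is hypothetical):</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"># Hypothetical helper: assemble the kernel ip= argument.
</div><div class="line" data-line="2"># Fields: client-ip:server-ip:gw-ip:netmask:hostname:device:autoconf
</div><div class="line" data-line="3"># server-ip and hostname stay empty, autoconf is off (static config).
</div><div class="line" data-line="4">kernel_ip_arg() {
</div><div class="line" data-line="5">  printf 'ip=%s::%s:%s::%s:off\n' "$1" "$2" "$3" "$4"
</div><div class="line" data-line="6">}
</div><div class="line" data-line="7">
</div><div class="line" data-line="8"># Guest address, gateway (the tap0 address on the host), netmask, guest device.
</div><div class="line" data-line="9">kernel_ip_arg 172.16.0.42 172.16.0.1 255.255.255.0 eth0
</div></code></pre>
<p>The last line prints the same ip= value that is passed to put-boot-source above.</p>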
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Thu, 03 Dec 2020 09:31:21 +0200</pubDate>
    </item>
    <item>
      <title>Getting started with Firecracker on Raspberry Pi</title>
      <link>https://dev.l1x.be/posts/2020/11/22/getting-started-with-firecracker-on-raspberry-pi/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2020/11/22/getting-started-with-firecracker-on-raspberry-pi/</guid>
      <content:encoded><![CDATA[<h2 id="article-series"><a href="#article-series">Article Series</a></h2>
<ul>
<li>1st part :: this</li>
<li>2nd part :: <a href="https://dev.l1x.be/posts/2020/12/13/diving-into-firecracker-with-alpine/">https://dev.l1x.be/posts/2020/12/13/diving-into-firecracker-with-alpine/</a></li>
</ul>
<h2 id="abstract"><a href="#abstract">Abstract</a></h2>
<p>Traditionally, services were deployed on bare metal. In the last decades we have seen the rise of virtualisation (running additional operating systems in an operating system process) and, lately, containerisation (running an operating system process in a separate security context from the rest of the processes on the same host). Virtualisation and containerisation offer different levels of isolation by moving some operating system functionality to the guest systems.</p>
<p>The following chart illustrates that pretty well:</p>
<p><img src="https://dev.l1x.be/img/isolation.png" alt="OS functionality location" /></p>
<p>Source: <a href="https://research.cs.wisc.edu/multifacet/papers/vee20_blending.pdf">https://research.cs.wisc.edu/multifacet/papers/vee20_blending.pdf</a></p>
<p>In this article, I perform a deep dive into Firecracker and how it can be used for deploying services on Raspberry Pi (4B).</p>
<h2 id="getting-started"><a href="#getting-started">Getting started</a></h2>
<p>There are a few paths to take here. First I am going to try the easy one, using Ubuntu. Later on we can investigate the use of Alpine Linux, which is much more lightweight than Ubuntu and ideal for devices like the Raspberry Pi.</p>
<h3 id="installing-the-image-on-a-microsd-card"><a href="#installing-the-image-on-a-microsd-card">Installing the image on a microSD card</a></h3>
<p>We need a 64-bit Ubuntu image and a microSD card. For the imaging I use <a href="https://www.balena.io/etcher/">Balena Etcher</a>, which makes the process super easy.</p>
<p>Getting the pre-installed image:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">wget</span> <span style="color: #e6edf3;">https://cdimage.ubuntu.com/releases/20.04/release/</span>\
</div><div class="line" data-line="2"><span style="color: #e6edf3;">ubuntu-20.04.1-preinstalled-server-arm64+raspi.img.xz</span>
</div></code></pre>
<p>Preinstalled means that we get a fully working operating system and there is no need for additional installation steps after booting up. With Balena Etcher it is super easy to write the compressed image file to the SD card and boot the system up once ready. SSHD starts up after the installation and we can log in via ssh if we know the IP address that the DHCP server issues to our device (assuming a DHCP server is present in our LAN).</p>
<p>There are a few mildly annoying things with Ubuntu (snaps, unattended-upgrades) that I usually remove. I also prefer to use Chrony over the systemd equivalent. An Ansible repo for these is available here: <a href="https://github.com/l1x/rpi/blob/main/ubuntu.20/ansible/roles/os/tasks/main.yml">https://github.com/l1x/rpi/blob/main/ubuntu.20/ansible/roles/os/tasks/main.yml</a></p>
<h3 id="installing-firecracker-jailer-and-firectl"><a href="#installing-firecracker-jailer-and-firectl">Installing Firecracker, Jailer and Firectl</a></h3>
<ul>
<li>Firecracker: The main component, it is a virtual machine monitor (VMM) that uses the Linux Kernel Virtual Machine (KVM) to create and run microVMs.</li>
<li>Jailer: Used for starting Firecracker in production mode. It applies a cgroup/namespace isolation barrier and then drops privileges.</li>
<li>Firectl: A command line utility for convenience</li>
</ul>
<h4 id="getting-firecracker-and-jailer"><a href="#getting-firecracker-and-jailer">Getting Firecracker and Jailer</a></h4>
<p>For the first two it is possible to download the release binaries from GitHub.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">version</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">&#39;v0.23.0&#39;</span>
</div><div class="line" data-line="2">
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">wget</span> <span style="color: #e6edf3;">https://github.com/firecracker-microvm/firecracker/</span>\
</div><div class="line" data-line="4"><span style="color: #e6edf3;">releases/download/</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">version</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">/firecracker-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">version</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">-aarch64</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">wget</span> <span style="color: #e6edf3;">https://github.com/firecracker-microvm/firecracker/</span>\
</div><div class="line" data-line="6"><span style="color: #e6edf3;">releases/download/</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">version</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">/jailer-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">version</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">-aarch64</span>
</div><div class="line" data-line="7">
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">mv</span> <span style="color: #e6edf3;">firecracker-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">version</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">-aarch64</span> <span style="color: #e6edf3;">firecracker</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">mv</span> <span style="color: #e6edf3;">jailer-</span><span style="color: #e6edf3;">$&lbrace;</span><span style="color: #e6edf3;">version</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">-aarch64</span> <span style="color: #e6edf3;">jailer</span>
</div><div class="line" data-line="10">
</div><div class="line" data-line="11"><span style="color: #d2a8ff;">chmod</span> <span style="color: #e6edf3;">+x</span> <span style="color: #e6edf3;">firecracker</span> <span style="color: #e6edf3;">jailer</span>
</div><div class="line" data-line="12">
</div><div class="line" data-line="13"><span style="color: #d2a8ff;">./firecracker</span> <span style="color: #e6edf3;">--help</span>
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">./jailer</span> <span style="color: #e6edf3;">--help</span>
</div></code></pre>
<h4 id="firectl"><a href="#firectl">Firectl</a></h4>
<p>Firectl is a bit trickier to install because there is no release binary and it requires Golang 1.14 to compile. We can do this in two steps.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">wget</span> <span style="color: #e6edf3;">https://golang.org/dl/go1.14.12.linux-arm64.tar.gz</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">tar</span> <span style="color: #e6edf3;">xzvf</span> <span style="color: #e6edf3;">go1.14.12.linux-arm64.tar.gz</span>
</div></code></pre>
<p>After getting Go we can fetch the source of Firectl and compile it:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">git</span> <span style="color: #e6edf3;">clone</span> <span style="color: #e6edf3;">https://github.com/firecracker-microvm/firectl.git</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">cd</span> <span style="color: #e6edf3;">firectl/</span>
</div><div class="line" data-line="3"> <span style="color: #d2a8ff;">~/go/bin/go</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">-x</span>
</div></code></pre>
<p>Testing Firectl:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">./firectl</span> <span style="color: #e6edf3;">--help</span>
</div></code></pre>
<p>We now have all the tools we need for running our first microVM. The only thing missing is something to run.</p>
<h3 id="downloading-our-first-image"><a href="#downloading-our-first-image">Downloading our first image</a></h3>
<p>A microVM needs two things:</p>
<ul>
<li>an uncompressed Linux kernel (vmlinux)</li>
<li>a filesystem</li>
</ul>
<p>Later on we are going to investigate how we could create our own version of these, but for now we are going to use the following prebuilt images:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">wget</span> <span style="color: #e6edf3;">https://s3.amazonaws.com/spec.ccfc.min/</span>\
</div><div class="line" data-line="2"><span style="color: #e6edf3;">img/aarch64/ubuntu_with_ssh/kernel/vmlinux.bin</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">wget</span> <span style="color: #e6edf3;">https://s3.amazonaws.com/spec.ccfc.min/</span>\
</div><div class="line" data-line="4"><span style="color: #e6edf3;">img/aarch64/ubuntu_with_ssh/fsfiles/xenial.rootfs.ext4</span>
</div></code></pre>
<h3 id="configuring-network"><a href="#configuring-network">Configuring network</a></h3>
<p>For the microVM to function properly we need a networking device. For this scenario we are going to create a TAP device:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">ip</span> <span style="color: #e6edf3;">tuntap</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">dev</span> <span style="color: #e6edf3;">tap0</span> <span style="color: #e6edf3;">mode</span> <span style="color: #e6edf3;">tap</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">ip</span> <span style="color: #e6edf3;">addr</span> <span style="color: #e6edf3;">add</span> <span style="color: #e6edf3;">172.16.0.1/24</span> <span style="color: #e6edf3;">dev</span> <span style="color: #e6edf3;">tap0</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">ip</span> <span style="color: #e6edf3;">link</span> <span style="color: #e6edf3;">set</span> <span style="color: #e6edf3;">tap0</span> <span style="color: #e6edf3;">up</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">ip</span> <span style="color: #e6edf3;">addr</span> <span style="color: #e6edf3;">show</span> <span style="color: #e6edf3;">dev</span> <span style="color: #e6edf3;">tap0</span>
</div></code></pre>
<p>If we want to give our VM access to the outside network, we have to enable IP forwarding:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #79c0ff;">DEVICE_NAME</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">eth0</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">sh</span> <span style="color: #e6edf3;">-c</span> <span style="color: #a5d6ff;">&quot;echo 1 &gt; /proc/sys/net/ipv4/ip_forward&quot;</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">iptables</span> <span style="color: #e6edf3;">-t</span> <span style="color: #e6edf3;">nat</span> <span style="color: #e6edf3;">-A</span> <span style="color: #e6edf3;">POSTROUTING</span> <span style="color: #e6edf3;">-o</span> <span style="color: #e6edf3;">$</span><span style="color: #79c0ff;">DEVICE_NAME</span> <span style="color: #e6edf3;">-j</span> <span style="color: #e6edf3;">MASQUERADE</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">iptables</span> <span style="color: #e6edf3;">-A</span> <span style="color: #e6edf3;">FORWARD</span> <span style="color: #e6edf3;">-m</span> <span style="color: #e6edf3;">conntrack</span> <span style="color: #e6edf3;">--ctstate</span> <span style="color: #e6edf3;">RELATED,ESTABLISHED</span> <span style="color: #e6edf3;">-j</span> <span style="color: #e6edf3;">ACCEPT</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">iptables</span> <span style="color: #e6edf3;">-A</span> <span style="color: #e6edf3;">FORWARD</span> <span style="color: #e6edf3;">-i</span> <span style="color: #e6edf3;">tap0</span> <span style="color: #e6edf3;">-o</span> <span style="color: #e6edf3;">$</span><span style="color: #79c0ff;">DEVICE_NAME</span> <span style="color: #e6edf3;">-j</span> <span style="color: #e6edf3;">ACCEPT</span>
</div></code></pre>
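<p>Before starting a guest it can be worth checking that forwarding is really on. A minimal sketch, assuming a POSIX shell; the forwarding_enabled name is made up here, and the optional argument only exists so the check can be pointed at another file:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"># Hypothetical helper: succeed if IPv4 forwarding is enabled.
</div><div class="line" data-line="2"># Reads /proc/sys/net/ipv4/ip_forward unless another file is given.
</div><div class="line" data-line="3">forwarding_enabled() {
</div><div class="line" data-line="4">  [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}")" = "1" ]
</div><div class="line" data-line="5">}
</div><div class="line" data-line="6">
</div><div class="line" data-line="7">if forwarding_enabled; then
</div><div class="line" data-line="8">  echo "IP forwarding is enabled"
</div><div class="line" data-line="9">else
</div><div class="line" data-line="10">  echo "IP forwarding is disabled"
</div><div class="line" data-line="11">fi
</div></code></pre>
<p>Keep in mind that writing to /proc this way does not survive a reboot, so the check is also handy after restarting the host.</p>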
<h3 id="running-our-first-microvm"><a href="#running-our-first-microvm">Running our first microVM</a></h3>
<p>This is how we can start up our first microVM. I usually start it in screen so I can open a new session easily, because it will use the standard input and output for the console of the newly started VM (unless you redirect it).</p>
<p>This is for debug mode, starting with sudo:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">./firectl/firectl</span> \
</div><div class="line" data-line="2"><span style="color: #e6edf3;">--firecracker-binary=./firecracker</span> \
</div><div class="line" data-line="3"><span style="color: #e6edf3;">--kernel=vmlinux.bin</span> \
</div><div class="line" data-line="4"><span style="color: #e6edf3;">--tap-device=tap0/aa:fc:00:00:00:01</span> \
</div><div class="line" data-line="5"><span style="color: #e6edf3;">--kernel-opts=</span>\
</div><div class="line" data-line="6"><span style="color: #a5d6ff;">&quot;console=ttyS0 reboot=k panic=1 pci=off \</span>
</div><div class="line" data-line="7"><span style="color: #a5d6ff;">ip=172.16.0.42::172.16.0.1:255.255.255.0::eth0:off&quot;</span> \
</div><div class="line" data-line="8"><span style="color: #e6edf3;">--root-drive=./xenial.rootfs.ext4</span>
</div></code></pre>
<p>If everything went well you should see something like this:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">Ubuntu 18.04.2 LTS fadfdd4af58a ttyS0
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">fadfdd4af58a login:
</div></code></pre>
<p>The user and password are root:root.</p>
<h3 id="testing-networking"><a href="#testing-networking">Testing networking</a></h3>
<p>For this we need a slightly bigger image.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">dd</span> <span style="color: #e6edf3;">if=/dev/zero</span> <span style="color: #e6edf3;">bs=1M</span> <span style="color: #e6edf3;">count=800</span> <span style="color: #79c0ff;">&gt;&gt;</span> <span style="color: #e6edf3;">xenial.rootfs.ext4</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">resize2fs</span> <span style="color: #e6edf3;">-f</span> <span style="color: #e6edf3;">xenial.rootfs.ext4</span>
</div></code></pre>
<p>After starting up the usual way and logging in we need to fix a few things:</p>
<p>Adding a working nameserver:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">echo</span> <span style="color: #a5d6ff;">&#39;nameserver 1.1.1.1&#39;</span> <span style="color: #79c0ff;">&gt;</span>  <span style="color: #e6edf3;">/etc/resolv.conf</span>
</div></code></pre>
<p>Now trying to update:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">root</span><span style="color: #d2a8ff;">@fadfdd4af58a:~#</span> <span style="color: #e6edf3;">apt</span> <span style="color: #e6edf3;">update</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">Get:1</span> <span style="color: #e6edf3;">http://ports.ubuntu.com/ubuntu-ports</span> <span style="color: #e6edf3;">bionic</span> <span style="color: #e6edf3;">InRelease</span> <span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">242</span> <span style="color: #e6edf3;">kB</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">Get:2</span> <span style="color: #e6edf3;">http://ports.ubuntu.com/ubuntu-ports</span> <span style="color: #e6edf3;">bionic-updates</span> <span style="color: #e6edf3;">InRelease</span> <span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">88.7</span> <span style="color: #e6edf3;">kB</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">Hit:3</span> <span style="color: #e6edf3;">http://ports.ubuntu.com/ubuntu-ports</span> <span style="color: #e6edf3;">bionic-backports</span> <span style="color: #e6edf3;">InRelease</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">Hit:4</span> <span style="color: #e6edf3;">http://ports.ubuntu.com/ubuntu-ports</span> <span style="color: #e6edf3;">bionic-security</span> <span style="color: #e6edf3;">InRelease</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">Get:5</span> <span style="color: #e6edf3;">http://ports.ubuntu.com/ubuntu-ports</span> <span style="color: #e6edf3;">bionic/universe</span> <span style="color: #e6edf3;">arm64</span> <span style="color: #e6edf3;">Packages</span> <span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">11.0</span> <span style="color: #e6edf3;">MB</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">Get:6</span> <span style="color: #e6edf3;">http://ports.ubuntu.com/ubuntu-ports</span> <span style="color: #e6edf3;">bionic/multiverse</span> <span style="color: #e6edf3;">arm64</span> <span style="color: #e6edf3;">Packages</span> <span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">153</span> <span style="color: #e6edf3;">kB</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">Get:7</span> <span style="color: #e6edf3;">http://ports.ubuntu.com/ubuntu-ports</span> <span style="color: #e6edf3;">bionic/main</span> <span style="color: #e6edf3;">arm64</span> <span style="color: #e6edf3;">Packages</span> <span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">1285</span> <span style="color: #e6edf3;">kB</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">Get:8</span> <span style="color: #e6edf3;">http://ports.ubuntu.com/ubuntu-ports</span> <span style="color: #e6edf3;">bionic/restricted</span> <span style="color: #e6edf3;">arm64</span> <span style="color: #e6edf3;">Packages</span> <span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">572</span> <span style="color: #e6edf3;">B</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">Get:9</span> <span style="color: #e6edf3;">http://ports.ubuntu.com/ubuntu-ports</span> <span style="color: #e6edf3;">bionic-updates/universe</span> <span style="color: #e6edf3;">arm64</span> <span style="color: #e6edf3;">Packages</span> <span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">1865</span> <span style="color: #e6edf3;">kB</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="11"><span style="color: #d2a8ff;">Get:10</span> <span style="color: #e6edf3;">http://ports.ubuntu.com/ubuntu-ports</span> <span style="color: #e6edf3;">bionic-updates/restricted</span> <span style="color: #e6edf3;">arm64</span> <span style="color: #e6edf3;">Packages</span> <span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">2262</span> <span style="color: #e6edf3;">B</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">Get:11</span> <span style="color: #e6edf3;">http://ports.ubuntu.com/ubuntu-ports</span> <span style="color: #e6edf3;">bionic-updates/main</span> <span style="color: #e6edf3;">arm64</span> <span style="color: #e6edf3;">Packages</span> <span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">1431</span> <span style="color: #e6edf3;">kB</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="13"><span style="color: #d2a8ff;">Get:12</span> <span style="color: #e6edf3;">http://ports.ubuntu.com/ubuntu-ports</span> <span style="color: #e6edf3;">bionic-updates/multiverse</span> <span style="color: #e6edf3;">arm64</span> <span style="color: #e6edf3;">Packages</span> <span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">5758</span> <span style="color: #e6edf3;">B</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">Fetched</span> <span style="color: #e6edf3;">16.1</span> <span style="color: #e6edf3;">MB</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">6s</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #79c0ff;">2543</span> <span style="color: #e6edf3;">kB/s</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="15"><span style="color: #d2a8ff;">Reading</span> <span style="color: #e6edf3;">package</span> <span style="color: #e6edf3;">lists...</span> <span style="color: #e6edf3;">Error!</span>
</div><div class="line" data-line="16"><span style="color: #d2a8ff;">E:</span> <span style="color: #e6edf3;">flAbsPath</span> <span style="color: #e6edf3;">on</span> <span style="color: #e6edf3;">/var/lib/dpkg/status</span> <span style="color: #e6edf3;">failed</span> <span style="color: #e6edf3;">-</span> <span style="color: #e6edf3;">realpath</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">2:</span> <span style="color: #e6edf3;">No</span> <span style="color: #e6edf3;">such</span> <span style="color: #e6edf3;">file</span> <span style="color: #e6edf3;">or</span> <span style="color: #e6edf3;">directory</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="17"><span style="color: #d2a8ff;">E:</span> <span style="color: #e6edf3;">Could</span> <span style="color: #e6edf3;">not</span> <span style="color: #e6edf3;">open</span> <span style="color: #e6edf3;">file</span>  <span style="color: #e6edf3;">-</span> <span style="color: #e6edf3;">open</span><span style="color: #e6edf3;"></span> <span style="color: #e6edf3;">(</span><span style="color: #d2a8ff;">2:</span> <span style="color: #e6edf3;">No</span> <span style="color: #e6edf3;">such</span> <span style="color: #e6edf3;">file</span> <span style="color: #e6edf3;">or</span> <span style="color: #e6edf3;">directory</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="18"><span style="color: #d2a8ff;">E:</span> <span style="color: #e6edf3;">Problem</span> <span style="color: #e6edf3;">opening</span>
</div><div class="line" data-line="19"><span style="color: #d2a8ff;">E:</span> <span style="color: #e6edf3;">The</span> <span style="color: #e6edf3;">package</span> <span style="color: #e6edf3;">lists</span> <span style="color: #e6edf3;">or</span> <span style="color: #e6edf3;">status</span> <span style="color: #e6edf3;">file</span> <span style="color: #e6edf3;">could</span> <span style="color: #e6edf3;">not</span> <span style="color: #e6edf3;">be</span> <span style="color: #e6edf3;">parsed</span> <span style="color: #e6edf3;">or</span> <span style="color: #e6edf3;">opened.</span>
</div></code></pre>
<p>Fixing the apt issues:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">mkdir</span> <span style="color: #e6edf3;">-p</span> <span style="color: #e6edf3;">/var/lib/dpkg/</span><span style="color: #e6edf3;">&lbrace;</span><span style="color: #e6edf3;">info,alternatives</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">touch</span> <span style="color: #e6edf3;">/var/lib/dpkg/status</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">apt</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">apt-utils</span> <span style="color: #e6edf3;">-y</span>
</div></code></pre>
<p>Enjoy!</p>
<p>Next time we can go through how to compile a new kernel and have a different rootfs (potentially using Alpine).</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sun, 22 Nov 2020 14:25:21 +0100</pubDate>
    </item>
    <item>
      <title>Why I chose Fsharp for our AWS Lambda project</title>
      <link>https://dev.l1x.be/posts/2020/05/08/why-i-chose-fsharp-for-our-aws-lambda-project/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2020/05/08/why-i-chose-fsharp-for-our-aws-lambda-project/</guid>
      <content:encoded><![CDATA[<h1 id="why-i-chose-fsharp-for-our-aws-lambda-project"><a href="#why-i-chose-fsharp-for-our-aws-lambda-project">Why I chose Fsharp for our AWS Lambda project</a></h1>
<h2 id="the-dilema"><a href="#the-dilema">The dilemma</a></h2>
<p>I wanted to create a simple Lambda function to track how our users use the website and the web application without a third party and a ton of external dependencies, especially avoiding third-party JavaScript and leaking data to mass-surveillance companies. The easiest way is to use a simple 1x1 tracking pixel or beacon that collects just the right amount of information (strictly non-PII). This gives us enough information to create basic funnels, which covers most of our needs.</p>
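<p>To make the idea concrete, here is a minimal sketch of what serving such a pixel from a Lambda behind API Gateway could look like. It is written in Python only for brevity and is not the code from this project; the function names are hypothetical, the choice of logged fields is an assumption, and the response keys follow the API Gateway Lambda proxy format.</p>

```python
import base64

# A commonly used transparent 1x1 GIF, kept as base64 text.
PIXEL_GIF_BASE64 = "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"


def pixel_response():
    """Build an API Gateway proxy response that serves the 1x1 GIF."""
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "image/gif",
            # Ask browsers not to cache, so every page view reaches the Lambda.
            "Cache-Control": "no-store",
        },
        "body": PIXEL_GIF_BASE64,
        "isBase64Encoded": True,
    }


def handler(event, context):
    """Hypothetical Lambda entry point: note the hit, then return the pixel."""
    # Only non-PII fields are read here; which fields to keep is an assumption.
    print(event.get("resource"), event.get("httpMethod"))
    return pixel_response()
```

<p>The page then only needs an image tag pointing at the endpoint; no JavaScript has to run in the browser.</p>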
<h2 id="first-option-python"><a href="#first-option-python">First Option: Python</a></h2>
<p>My default language (regardless of what I am going to work on) is Python. It has many great features, it is easy to prototype in, and the performance is great once you are using a C++ or Rust backed library. Those native libraries also introduce a few issues when you are trying to deploy to AWS Lambda. I develop mainly on macOS and Lambda runs on Linux. Once you need to compile anything, it is hard to get it right because Python's packaging tools do not make it easy to build for a different platform.</p>
<p><a href="https://stackoverflow.com/questions/44490197/how-to-cross-compile-python-packages-with-pip">https://stackoverflow.com/questions/44490197/how-to-cross-compile-python-packages-with-pip</a></p>
<p>I was running into packaging issues because on a Mac it is not easy to cross-compile and package Python code. Maybe it would have worked if I had created a proper package, but I could not find a simple way without Docker. It would be extremely valuable if Python had a way to build a package that you upload to AWS and it works, 100%. I kept running into cases where the code worked on my Mac and did not work on AWS, and I did not have enough time to investigate.</p>
<h2 id="second-option-rust"><a href="#second-option-rust">Second Option: Rust</a></h2>
<p>Rust has become a rising star over the years and I try to use it as much as possible, with mixed success. My biggest problem with Rust is its low-level nature and the quirky features that are hard to reason about. From the AWS Lambda examples:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-rust" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #ff7b72;">use</span> <span style="color: #ff7b72;">lambda</span><span style="color: #e6edf3;">::</span><span style="color: #e6edf3;">handler_fn</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="2"><span style="color: #ff7b72;">use</span> <span style="color: #ff7b72;">serde_json</span><span style="color: #e6edf3;">::</span><span style="color: #ffa657;">Value</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="3">
</div><div class="line" data-line="4"><span style="color: #ff7b72;">type</span> <span style="color: #ffa657;">Error</span> <span style="color: #79c0ff;">=</span> <span style="color: #ffa657;">Box</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ff7b72;">dyn</span> <span style="color: #ff7b72;">std</span><span style="color: #e6edf3;">::</span><span style="color: #ff7b72;">error</span><span style="color: #e6edf3;">::</span><span style="color: #ffa657;">Error</span> <span style="color: #79c0ff;">+</span> <span style="color: #ffa657;">Send</span> <span style="color: #79c0ff;">+</span> <span style="color: #ffa657;">Sync</span> <span style="color: #79c0ff;">+</span> <span style="color: #ff7b72;">&#39;</span><span style="color: #e6edf3;">static</span><span style="color: #e6edf3;">&gt;</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="5">
</div><div class="line" data-line="6"><span style="color: #e6edf3;">#</span><span style="color: #e6edf3;">[</span><span style="color: #ff7b72;">tokio</span><span style="color: #e6edf3;">::</span><span style="color: #d2a8ff;">main</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="7"><span style="color: #ff7b72;">async</span> <span style="color: #ff7b72;">fn</span> <span style="color: #d2a8ff;">main</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">-&gt;</span> <span style="color: #ffa657;">Result</span><span style="color: #e6edf3;">&lt;</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">Error</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="8">    <span style="color: #ff7b72;">let</span> <span style="color: #e6edf3;">func</span> <span style="color: #79c0ff;">=</span> <span style="color: #d2a8ff;">handler_fn</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">func</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="9">    <span style="color: #ff7b72;">lambda</span><span style="color: #e6edf3;">::</span><span style="color: #d2a8ff;">run</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">func</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">.</span><span style="color: #ff7b72;">await</span><span style="color: #79c0ff;">?</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="10">    <span style="color: #79c0ff;">Ok</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">)</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="11"><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="12">
</div><div class="line" data-line="13"><span style="color: #ff7b72;">async</span> <span style="color: #ff7b72;">fn</span> <span style="color: #d2a8ff;">func</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">event</span><span style="color: #e6edf3;">:</span> <span style="color: #ffa657;">Value</span><span style="color: #e6edf3;">)</span> <span style="color: #e6edf3;">-&gt;</span> <span style="color: #ffa657;">Result</span><span style="color: #e6edf3;">&lt;</span><span style="color: #ffa657;">Value</span><span style="color: #e6edf3;">,</span> <span style="color: #ffa657;">Error</span><span style="color: #e6edf3;">&gt;</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="14">    <span style="color: #79c0ff;">Ok</span><span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">event</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="15"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
<p>Do you think that everybody understands immediately what is going on here? I don't. Even if I do, how am I going to explain this to a junior dev? How long does it take to get productive in Rust? I know that for extreme performance we might need this, but our current application is super happy without Rust; we do not have a performance problem. It is more important that developers are productive and the code is super simple to understand.</p>
<h2 id="and-the-winner-is-fsharp"><a href="#and-the-winner-is-fsharp">And the winner is: Fsharp</a></h2>
<p>Fsharp is a member of the ML family, runs on the .NET platform, and has a pretty mature ecosystem. Developers can pick it up quickly, especially the way we use it: simple functions with small types will do. The performance is great out of the box, and in case you need more, you have great tooling around it.</p>
<p>Our handler function:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">  let handler(request:APIGatewayProxyRequest) =
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">    let httpResource =
</div><div class="line" data-line="4">      match isNull request.Resource with
</div><div class="line" data-line="5">      | true  -&gt; &quot;None&quot;
</div><div class="line" data-line="6">      | _     -&gt; request.Resource
</div><div class="line" data-line="7">
</div><div class="line" data-line="8">    let httpMethod =
</div><div class="line" data-line="9">      match isNull request.HttpMethod with
</div><div class="line" data-line="10">      | true  -&gt; &quot;None&quot;
</div><div class="line" data-line="11">      | _     -&gt; request.HttpMethod
</div><div class="line" data-line="12">
</div><div class="line" data-line="13">    let httpHeadersAccept =
</div><div class="line" data-line="14">      match isNull request.Headers with
</div><div class="line" data-line="15">      | true  -&gt; &quot;None&quot;
</div><div class="line" data-line="16">      | _     -&gt; getOrDefault request.Headers  &quot;Accept&quot; &quot;None&quot;
</div><div class="line" data-line="17">
</div><div class="line" data-line="18">    let acceptImage =
</div><div class="line" data-line="19">      let pattern = @&quot;image/&quot;
</div><div class="line" data-line="20">      let m = Regex.Match(httpHeadersAccept, pattern)
</div><div class="line" data-line="21">      m.Success
</div><div class="line" data-line="22">
</div><div class="line" data-line="23">    let log = String.Format(&quot;&lbrace;0&rbrace; :: &lbrace;1&rbrace; :: &lbrace;2&rbrace;&quot;, httpResource, httpMethod, httpHeadersAccept)
</div><div class="line" data-line="24">    LambdaLogger.Log(log)
</div><div class="line" data-line="25">    match (httpResource, httpMethod, httpHeadersAccept, acceptImage) with
</div><div class="line" data-line="26">    | (&quot;/trck&quot;,         &quot;POST&quot;, &quot;application/json&quot;, _    ) -&gt; trckPost(request)
</div><div class="line" data-line="27">    | (&quot;/trck&quot;,         &quot;GET&quot;,  _,                  true ) -&gt; trckGet(request)
</div><div class="line" data-line="28">    | (&quot;/trck/&lbrace;image&rbrace;&quot;, &quot;GET&quot;,  _,                  true ) -&gt; trckGet(request)
</div><div class="line" data-line="29">    | (&quot;/echo&quot;,         &quot;GET&quot;,  _,                  _    ) -&gt; echoGet(request)
</div><div class="line" data-line="30">    | (_,               _,      _,                  _    ) -&gt; notFound(request)
</div></code></pre>
<p>Pretty readable code. Sure, you have to deal with nulls, but Fsharp gives you great tooling around it. It took me probably a couple of days to go from zero experience with .NET to deploying the first working API that has all of the functionality we are looking for. I might not write idiomatic Fsharp yet, but I am happy with the results so far. In the last couple of weeks, I have written many small tools in Fsharp, mostly dealing with the AWS APIs. I like it so much that I replaced my Python-first approach and now try to implement everything in Fsharp first. I can develop at the same pace as with Python, but the result is much more solid code and easier deployments (goodbye pip).</p>
<p>I think Fsharp is exactly in the sweet spot of programming languages: good enough performance, nice enough features, and a ton of great libraries. It does not have the problem that Python suffers from: you can create a single zip that will work on all platforms. It is also free from exposing the low-level details that I do not want to care about in business domain code, unlike Rust.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Fri, 08 May 2020 14:31:21 +0200</pubDate>
    </item>
    <item>
      <title>How long will the world’s uranium supplies last?</title>
      <link>https://dev.l1x.be/posts/2020/05/01/how-long-will-the-worlds-uranium-supplies-last/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2020/05/01/how-long-will-the-worlds-uranium-supplies-last/</guid>
      <content:encoded><![CDATA[<h2 id="basics"><a href="#basics">Basics</a></h2>
<blockquote>
<p>With a complete combustion or fission, approx. 8 kWh of heat can be generated from 1 kg of coal, approx. 12 kWh from 1 kg of mineral oil and around 24,000,000 kWh from 1 kg of uranium-235.
Related to one kilogram, uranium-235 contains two to three million times the energy equivalent of oil or coal. The illustration shows how much coal, oil or natural uranium is required for a
certain quantity of electricity. Thus, 1 kg natural uranium - following a corresponding enrichment and used for power generation in light water reactors - corresponds to nearly 10,000 kg of
mineral oil or 14,000 kg of coal and enables the generation of 45,000 kWh of electricity.</p>
</blockquote>
<ul>
<li>
<p>1 kg natural uranium -&gt; 45,000 kWh</p>
</li>
<li>
<p>1 kg U235 -&gt; 24,000,000 kWh</p>
</li>
</ul>
<p><a href="https://www.euronuclear.org/glossary/fuel-comparison/">https://www.euronuclear.org/glossary/fuel-comparison/</a></p>
<h2 id="military-warheads-as-a-source-of-nuclear-fuel"><a href="#military-warheads-as-a-source-of-nuclear-fuel">Military Warheads as a Source of Nuclear Fuel</a></h2>
<blockquote>
<p>Weapons-grade uranium and plutonium surplus to military requirements in the USA and Russia is being made available for use as civil fuel.
Weapons-grade uranium is highly enriched, to over 90% U-235 (the fissile isotope). Weapons-grade plutonium has over 93% Pu-239 and can be used, like reactor-grade plutonium, in fuel for electricity production.
Highly-enriched uranium from weapons stockpiles has been displacing some 8850 tonnes of U3O8 production from mines each year, and met about 13% to 19% of world reactor requirements through to 2013.</p>
</blockquote>
<p><a href="http://www.world-nuclear.org/information-library/nuclear-fuel-cycle/uranium-resources/military-warheads-as-a-source-of-nuclear-fuel.aspx">http://www.world-nuclear.org/information-library/nuclear-fuel-cycle/uranium-resources/military-warheads-as-a-source-of-nuclear-fuel.aspx</a></p>
<h2 id="in-nature"><a href="#in-nature">In nature</a></h2>
<blockquote>
<p>In nature, uranium is found as uranium-238 (99.2739–99.2752%), uranium-235 (0.7198–0.7202%), and a very small amount of uranium-234 (0.0050–0.0059%).</p>
</blockquote>
<p><a href="https://en.wikipedia.org/wiki/Uranium">https://en.wikipedia.org/wiki/Uranium</a></p>
<h2 id="enrichment"><a href="#enrichment">Enrichment</a></h2>
<blockquote>
<p>Highly enriched uranium (HEU) has a 20% or higher concentration of 235U. The fissile uranium in nuclear weapon primaries usually contains 85% or more of 235U known as weapons-grade, though theoretically for an implosion design, a minimum of 20% could be sufficient (called weapon(s)-usable) although it would require hundreds of kilograms of material and “would not be practical to design”</p>
</blockquote>
<p><a href="https://en.wikipedia.org/wiki/Enriched_uranium#Highly_enriched_uranium_.28HEU.29">https://en.wikipedia.org/wiki/Enriched_uranium#Highly_enriched_uranium_.28HEU.29</a></p>
<blockquote>
<p>Highly-enriched uranium in US and Russian weapons and other military stockpiles amounts to about 1500 tonnes</p>
</blockquote>
<p><a href="http://www.world-nuclear.org/information-library/nuclear-fuel-cycle/uranium-resources/military-warheads-as-a-source-of-nuclear-fuel.aspx">http://www.world-nuclear.org/information-library/nuclear-fuel-cycle/uranium-resources/military-warheads-as-a-source-of-nuclear-fuel.aspx</a></p>
<ul>
<li>
<p>1500 tonnes HEU at 20% .. 85% U235 -&gt; 300 .. 1275 tonnes U235 -&gt; 300,000 .. 1,275,000 kg</p>
</li>
<li>
<p>1 kg U235 -&gt; 24,000,000 kWh = 0.024 TWh</p>
</li>
<li>
<p>Total: 7,200 .. 30,600 TWh</p>
</li>
</ul>
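<p>The arithmetic above can be reproduced with a short script. This is a minimal sketch that uses only the figures quoted in this post; the constant and function names are made up for illustration, and the 20% and 85% bounds are the HEU and weapons-grade thresholds from the enrichment quote.</p>

```python
# Figures quoted in the post.
HEU_STOCKPILE_TONNES = 1500
KWH_PER_KG_U235 = 24_000_000
KWH_PER_TWH = 1_000_000_000


def stockpile_energy_twh(u235_percent):
    """Energy in TWh if the given percentage of the stockpile is U-235."""
    u235_kg = HEU_STOCKPILE_TONNES * 1000 * u235_percent / 100
    return u235_kg * KWH_PER_KG_U235 / KWH_PER_TWH


low_twh = stockpile_energy_twh(20)   # lower bound: 20% U-235
high_twh = stockpile_energy_twh(85)  # upper bound: 85% U-235
print(f"{low_twh:.0f} .. {high_twh:.0f} TWh")
```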
<h2 id="energy-in-the-us"><a href="#energy-in-the-us">Energy in the US</a></h2>
<blockquote>
<p>Primary energy use in the United States was 25,155 TWh or about 81,800 kWh per person in 2009. Primary energy use was 1,100 TWh less in the US than in China in 2009.</p>
</blockquote>
<p><a href="https://en.wikipedia.org/wiki/Energy_in_the_United_States#Consumption">https://en.wikipedia.org/wiki/Energy_in_the_United_States#Consumption</a></p>
<ul>
<li>
<p>US energy consumption per year: 25,155 TWh [2009]</p>
</li>
<li>
<p>China energy consumption per year: 26,255 TWh [2009] (1,100 TWh more than the US)</p>
</li>
<li>
<p>US: 0.29 .. 1.22 years</p>
</li>
<li>
<p>China: 0.27 .. 1.17 years</p>
</li>
</ul>
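<p>As a rough check, the number of years follows from dividing the stockpile energy by the yearly consumption. The sketch below assumes the 7,200 .. 30,600 TWh range from the enrichment section and reads the quote as saying that China used 1,100 TWh more than the US in 2009; all names are hypothetical.</p>

```python
# Stockpile energy range from the enrichment section, in TWh.
STOCKPILE_TWH = (7200, 30600)

# 2009 primary energy use in TWh. The quote gives the US figure directly and
# says the US used 1,100 TWh less than China, so China is derived from it.
US_TWH_PER_YEAR = 25_155
CHINA_TWH_PER_YEAR = US_TWH_PER_YEAR + 1_100


def years_of_supply(twh_per_year):
    """Return (low, high) years the stockpile could cover the given demand."""
    low, high = STOCKPILE_TWH
    return low / twh_per_year, high / twh_per_year


for name, demand in (("US", US_TWH_PER_YEAR), ("China", CHINA_TWH_PER_YEAR)):
    low, high = years_of_supply(demand)
    print(f"{name}: {low:.2f} .. {high:.2f} years")
```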
<h2 id="how-long-will-the-worlds-uranium-supplies-last"><a href="#how-long-will-the-worlds-uranium-supplies-last">How long will the world’s uranium supplies last?</a></h2>
<p>More than 200 years, at current rates of consumption.</p>
<blockquote>
<p>If the Nuclear Energy Agency (NEA) has accurately estimated the planet’s economically accessible uranium resources, reactors could run more than 200 years at current rates of consumption.</p>
</blockquote>
<p><a href="https://www.scientificamerican.com/article/how-long-will-global-uranium-deposits-last/">https://www.scientificamerican.com/article/how-long-will-global-uranium-deposits-last/</a></p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Fri, 01 May 2020 14:31:21 +0200</pubDate>
    </item>
    <item>
      <title>FreeNAS 11.3 upgrade issues</title>
      <link>https://dev.l1x.be/posts/2020/04/29/freenas-11.3-upgrade-issues/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2020/04/29/freenas-11.3-upgrade-issues/</guid>
      <content:encoded><![CDATA[<p>I had an interesting experience with the latest upgrade to FreeNAS 11.3-U2.1: the applications that were deployed in the jail were gone. After fiddling with iocage (the tool that FreeNAS provides to manage jails), I could restore a previous state where all seems fine and dandy.</p>
<h2 id="steps-to-restore"><a href="#steps-to-restore">Steps to restore</a></h2>
<ul>
<li>list the snapshots</li>
</ul>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">iocage</span> <span style="color: #e6edf3;">snaplist</span> <span style="color: #e6edf3;">tr</span> <span style="color: #e6edf3;">-l</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">+-------------------------------------------------------------------------+-----------------------+-------+-------+</span>
</div><div class="line" data-line="3"><span style="color: #79c0ff;">|</span>                                  <span style="color: #d2a8ff;">NAME</span>                                   <span style="color: #79c0ff;">|</span>        <span style="color: #d2a8ff;">CREATED</span>        <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">RSIZE</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">USED</span>  <span style="color: #79c0ff;">|</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">+=========================================================================+=======================+=======+=======+</span>
</div><div class="line" data-line="5"><span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">fux/iocage/jails/tr/root@ioc_plugin_update_2020-04-29</span>                   <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">Wed</span> <span style="color: #e6edf3;">Apr</span> <span style="color: #79c0ff;">29</span> <span style="color: #e6edf3;">14:55</span> <span style="color: #79c0ff;">2020</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">799G</span>  <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">11.6K</span> <span style="color: #79c0ff;">|</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">+-------------------------------------------------------------------------+-----------------------+-------+-------+</span>
</div><div class="line" data-line="7"><span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">fux/iocage/jails/tr/root@ioc_update_11.3-RELEASE-p8_2020-04-29_14-55-07</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">Wed</span> <span style="color: #e6edf3;">Apr</span> <span style="color: #79c0ff;">29</span> <span style="color: #e6edf3;">14:55</span> <span style="color: #79c0ff;">2020</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">799G</span>  <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">575K</span>  <span style="color: #79c0ff;">|</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">+-------------------------------------------------------------------------+-----------------------+-------+-------+</span>
</div><div class="line" data-line="9"><span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">fux/iocage/jails/tr/root@ioc_update_11.3-RELEASE-p8_2020-04-29_14-55-22</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">Wed</span> <span style="color: #e6edf3;">Apr</span> <span style="color: #79c0ff;">29</span> <span style="color: #e6edf3;">14:55</span> <span style="color: #79c0ff;">2020</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">799G</span>  <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">11.6K</span> <span style="color: #79c0ff;">|</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">+-------------------------------------------------------------------------+-----------------------+-------+-------+</span>
</div><div class="line" data-line="11"><span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">fux/iocage/jails/tr@ioc_plugin_update_2020-04-29</span>                        <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">Wed</span> <span style="color: #e6edf3;">Apr</span> <span style="color: #79c0ff;">29</span> <span style="color: #e6edf3;">14:55</span> <span style="color: #79c0ff;">2020</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">517K</span>  <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">81.4K</span> <span style="color: #79c0ff;">|</span>
</div><div class="line" data-line="12"><span style="color: #d2a8ff;">+-------------------------------------------------------------------------+-----------------------+-------+-------+</span>
</div><div class="line" data-line="13"><span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">fux/iocage/jails/tr@ioc_update_11.3-RELEASE-p8_2020-04-29_14-55-07</span>      <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">Wed</span> <span style="color: #e6edf3;">Apr</span> <span style="color: #79c0ff;">29</span> <span style="color: #e6edf3;">14:55</span> <span style="color: #79c0ff;">2020</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">517K</span>  <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">81.4K</span> <span style="color: #79c0ff;">|</span>
</div><div class="line" data-line="14"><span style="color: #d2a8ff;">+-------------------------------------------------------------------------+-----------------------+-------+-------+</span>
</div><div class="line" data-line="15"><span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">fux/iocage/jails/tr@ioc_update_11.3-RELEASE-p8_2020-04-29_14-55-22</span>      <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">Wed</span> <span style="color: #e6edf3;">Apr</span> <span style="color: #79c0ff;">29</span> <span style="color: #e6edf3;">14:55</span> <span style="color: #79c0ff;">2020</span> <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">517K</span>  <span style="color: #79c0ff;">|</span> <span style="color: #d2a8ff;">81.4K</span> <span style="color: #79c0ff;">|</span>
</div><div class="line" data-line="16"><span style="color: #d2a8ff;">+-------------------------------------------------------------------------+-----------------------+-------+-------+</span>
</div></code></pre>
<ul>
<li>stop the jail</li>
</ul>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">root</span><span style="color: #e6edf3;">@freenas</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">~</span>]<span style="color: #8b949e;"># iocage stop tr</span>
</div><div class="line" data-line="2">* Stopping tr
</div><div class="line" data-line="3">  <span style="color: #79c0ff;">+</span> Executing prestop OK
</div><div class="line" data-line="4">  <span style="color: #79c0ff;">+</span> Stopping services OK
</div><div class="line" data-line="5">  <span style="color: #79c0ff;">+</span> Tearing down VNET OK
</div><div class="line" data-line="6">  <span style="color: #79c0ff;">+</span> Removing devfs_ruleset: <span style="color: #79c0ff;">5</span> OK
</div><div class="line" data-line="7">  <span style="color: #79c0ff;">+</span> Removing jail process OK
</div><div class="line" data-line="8">  <span style="color: #79c0ff;">+</span> Executing poststop OK
</div></code></pre>
<ul>
<li>rollback</li>
</ul>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">root</span><span style="color: #e6edf3;">@freenas</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">~</span>]<span style="color: #8b949e;"># iocage rollback tr -n ioc_update_11.3-RELEASE-p8_2020-04-29_14-55-07</span>
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">This will destroy ALL data created including ALL snapshots taken after the snapshot ioc_update_11.3-RELEASE-p8_2020-04-29_14-55-07
</div><div class="line" data-line="4">
</div><div class="line" data-line="5">Are you sure? <span style="color: #e6edf3;">[</span>y/N<span style="color: #e6edf3;">]</span>: y
</div><div class="line" data-line="6">Rolled back to: fux/iocage/jails/tr
</div></code></pre>
<ul>
<li>start the jail</li>
</ul>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">root</span><span style="color: #e6edf3;">@freenas</span><span style="color: #e6edf3;">[</span><span style="color: #79c0ff;">~</span>]<span style="color: #8b949e;"># iocage start tr</span>
</div><div class="line" data-line="2">No default gateway found <span style="color: #ff7b72;">for</span> <span style="color: #e6edf3;">ipv6</span><span style="color: #d2a8ff;">.</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">*</span> <span style="color: #e6edf3;">Starting</span> <span style="color: #e6edf3;">tr</span>
</div><div class="line" data-line="4">  <span style="color: #d2a8ff;">+</span> <span style="color: #e6edf3;">Started</span> <span style="color: #e6edf3;">OK</span>
</div><div class="line" data-line="5">  <span style="color: #d2a8ff;">+</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">devfs_ruleset:</span> <span style="color: #79c0ff;">5</span>
</div><div class="line" data-line="6">  <span style="color: #d2a8ff;">+</span> <span style="color: #e6edf3;">Configuring</span> <span style="color: #e6edf3;">VNET</span> <span style="color: #e6edf3;">OK</span>
</div><div class="line" data-line="7">  <span style="color: #d2a8ff;">+</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">IP</span> <span style="color: #e6edf3;">options:</span> <span style="color: #e6edf3;">vnet</span>
</div><div class="line" data-line="8">  <span style="color: #d2a8ff;">+</span> <span style="color: #e6edf3;">Starting</span> <span style="color: #e6edf3;">services</span> <span style="color: #e6edf3;">OK</span>
</div><div class="line" data-line="9">  <span style="color: #d2a8ff;">+</span> <span style="color: #e6edf3;">Executing</span> <span style="color: #e6edf3;">poststart</span> <span style="color: #e6edf3;">OK</span>
</div><div class="line" data-line="10">  <span style="color: #d2a8ff;">+</span> <span style="color: #e6edf3;">DHCP</span> <span style="color: #e6edf3;">Address:</span> <span style="color: #e6edf3;">192.168.1.111/24</span>
</div></code></pre>
<p>And we are back.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 29 Apr 2020 14:31:21 +0200</pubDate>
    </item>
    <item>
      <title>Matching binary patterns</title>
      <link>https://dev.l1x.be/posts/2020/04/29/matching-binary-patterns/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2020/04/29/matching-binary-patterns/</guid>
      <content:encoded><![CDATA[<h1 id="matching-binary-patterns"><a href="#matching-binary-patterns">Matching binary patterns</a></h1>
<p>In Erlang, it is easy to construct binaries and bitstrings and to match binary patterns. I ran into Mitchell Perilstein's excellent work on NTP with Erlang and thought I would use it to explain how bitstrings and binaries work in Erlang.</p>
<p>Two concepts:</p>
<ul>
<li>
<p>A bitstring is a sequence of zero or more bits, where the number of bits does not need to be divisible by 8.</p>
</li>
<li>
<p>A binary is a bitstring whose number of bits is divisible by 8.</p>
</li>
</ul>
<p>The syntax is as follows:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1"> &lt;&lt;B1, B2, ... Bn&gt;&gt;
</div></code></pre>
<p>Each element specifies a certain segment of the bitstring. A segment is a set of contiguous bits of the binary (not necessarily on a byte boundary).</p>
<p>A real-life example:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1"> &lt;&lt; 0:2, 4:3, 3:3,  0:(3*8 + 3*32 + 4*64) &gt;&gt;.
</div></code></pre>
<p>Let's unpack a bit of what is going on here. For this, it is worth knowing the whole syntax.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">&lt;&lt; Value:Size/TypeSpecifierList, Value:Size/TypeSpecifierList, ...&gt;&gt;
</div></code></pre>
<p>Or alternatively:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">Ei = Value |
</div><div class="line" data-line="2">     Value:Size |
</div><div class="line" data-line="3">     Value/TypeSpecifierList |
</div><div class="line" data-line="4">     Value:Size/TypeSpecifierList
</div></code></pre>
<p>In the real-life example, the first segment has the value 0 and a size of 2 bits, the second has the value 4 and a size of 3 bits, and so on. We did not specify any type specifiers.</p>
<p>TypeSpecifierList is a list of type specifiers, in any order, separated by hyphens (-). Default values are used for any omitted type specifier.</p>
<p>The following type specifiers are supported:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">Type = integer | float | binary | bytes | bitstring | bits | utf8 | utf16 | utf32
</div></code></pre>
<p>The default is integer. bytes is shorthand for binary and bits is shorthand for bitstring.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">Signedness= signed | unsigned
</div></code></pre>
<p>Signedness only matters for matching and when the type is integer. The default is unsigned.</p>
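<p>To see why signedness only affects reading, here is a minimal Python sketch (Python is used purely for illustration, and the byte value 232 is an arbitrary example). The same eight bits decode to different integers depending on whether they are treated as signed, which is roughly the difference between matching a segment as Size 8 with and without the signed specifier.</p>

```python
# The same 8 bits, read once as unsigned and once as signed (two's complement).
raw = bytes([232])

unsigned_value = int.from_bytes(raw, byteorder="big", signed=False)
signed_value = int.from_bytes(raw, byteorder="big", signed=True)

print(unsigned_value)  # 232
print(signed_value)    # -24
```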
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">Endianness= big | little | native
</div></code></pre>
<p>Native-endian means that the endianness is resolved at load time to be either big-endian or little-endian, depending on what is native for the CPU that the Erlang machine is run on. Endianness only matters when the Type is either integer, utf16, utf32, or float. The default is big.</p>
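<p>As a rough illustration of the three endianness options, the Python sketch below serialises the 16-bit integer 258 in each byte order. The value is an arbitrary example; the big and little results should correspond to &lt;&lt;258:16/big&gt;&gt; and &lt;&lt;258:16/little&gt;&gt; in Erlang, while the native result depends on the machine running the code.</p>

```python
import sys

value = 0x0102  # 258, needs two bytes

big = value.to_bytes(2, byteorder="big")
little = value.to_bytes(2, byteorder="little")
native = value.to_bytes(2, byteorder=sys.byteorder)  # whatever this CPU uses

print(list(big))     # [1, 2]
print(list(little))  # [2, 1]
print(sys.byteorder, list(native))
```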
<h2 id="a-complete-example"><a href="#a-complete-example">A complete example</a></h2>
<p>One of the simplest protocols out there is NTP. The header looks like the following:</p>
<p><img src="https://dev-to-uploads.s3.amazonaws.com/i/imigpr35l0uhlkpbbh74.png" alt="Alt Text" /></p>
<p>This is used for both the request and the response. Let's craft the request first.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">create_ntp_request() -&gt;
</div><div class="line" data-line="2">  &lt;&lt; 0:2, 4:3, 3:3,  0:(3*8 + 3*32 + 4*64) &gt;&gt;.
</div></code></pre>
<p>Based on the header structure we can see that we have a 2-bit integer (Li), 3-bit integer version number, 3-bit integer mode, 8-bit stratum, 8-bit poll, 8-bit precision, and so on. We only need to set the first 3 values, the rest (376 bits) can be 0.</p>
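<p>For readers who do not have an Erlang shell at hand, the same packing can be sketched in Python with shifts. This is only an illustration of the bit layout described above, not part of the original code; the function name and default arguments are made up for the example.</p>

```python
def create_ntp_request(li=0, version=4, mode=3):
    """Pack LI (2 bits), version (3 bits) and mode (3 bits) into one byte,
    followed by 47 zero bytes (376 bits) for the rest of the header."""
    first_byte = (li << 6) | (version << 3) | mode
    return bytes([first_byte]) + bytes(47)


request = create_ntp_request()
print(request[0])    # 35
print(len(request))  # 48
```

<p>The first byte comes out as 35 because 0 occupies the top two bits, 4 the next three (4 * 8 = 32), and 3 the lowest three.</p>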
<p>Let's try this in the wild.</p>
<h3 id="creating-the-request"><a href="#creating-the-request">Creating the request</a></h3>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">1&gt; Request = &lt;&lt; 0:2, 4:3, 3:3,  0:(3*8 + 3*32 + 4*64) &gt;&gt;.
</div><div class="line" data-line="2">&lt;&lt;35,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
</div><div class="line" data-line="3">  0,0,...&gt;&gt;
</div></code></pre>
<h3 id="sending-and-receiving"><a href="#sending-and-receiving">Sending and receiving</a></h3>
<p>We can use Erlang's standard library for this: gen_udp provides a fairly comprehensive low-level UDP implementation that does everything we need.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">
</div><div class="line" data-line="2">% open a local socket, 0 indicates that it will pick a random local port
</div><div class="line" data-line="3">% active=false means we need to receive ourselves
</div><div class="line" data-line="4">
</div><div class="line" data-line="5">2&gt; &lbrace;ok, Socket&rbrace; = gen_udp:open(0, [binary, &lbrace;active, false&rbrace;]),
</div><div class="line" data-line="6">2&gt; gen_udp:send(Socket, &quot;0.europe.pool.ntp.org&quot;, 123, Request),
</div><div class="line" data-line="7">2&gt; &lbrace;ok, &lbrace;_Address, _Port, Resp&rbrace;&rbrace; = gen_udp:recv(Socket, 0, 500).
</div><div class="line" data-line="8">&lbrace;ok,&lbrace;&lbrace;212,59,0,1&rbrace;,
</div><div class="line" data-line="9">     123,
</div><div class="line" data-line="10">     &lt;&lt;36,2,0,231,0,0,0,110,0,0,0,25,212,59,3,3,226,84,62,89,
</div><div class="line" data-line="11">       208,192,202,156,...&gt;&gt;&rbrace;&rbrace;
</div></code></pre>
<h3 id="processing-the-response-first-few-bits"><a href="#processing-the-response-first-few-bits">Processing the response, first few bits</a></h3>
<p>The response is just a binary that we need to slice and dice, similar to how we created the request.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">3&gt; Resp.
</div><div class="line" data-line="2">&lt;&lt;36,2,0,231,0,0,0,110,0,0,0,25,212,59,3,3,226,84,62,89,
</div><div class="line" data-line="3">  208,192,202,156,0,0,0,0,0,...&gt;&gt;
</div></code></pre>
<p>First, we can just get the first few bits.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">4&gt; &lt;&lt; Li:2, Version:3, Mode:3, _rest/binary &gt;&gt; = Resp.
</div><div class="line" data-line="2">&lt;&lt;36,2,0,231,0,0,0,110,0,0,0,25,212,59,3,3,226,84,62,89,
</div><div class="line" data-line="3">  208,192,202,156,0,0,0,0,0,...&gt;&gt;
</div><div class="line" data-line="4">5&gt; &lbrace;li, Li, version, Version, mode, Mode&rbrace;.
</div><div class="line" data-line="5">&lbrace;li,0,version,4,mode,4&rbrace;
</div></code></pre>
<p>It works.</p>
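<p>The same three fields can be recovered by hand with shifts and masks, which is what the pattern match does for us. A small Python sketch, using the first byte of the response shown above (36):</p>

```python
first_byte = 36  # 0b00100100, the first byte of the response

li = first_byte >> 6                # top 2 bits
version = (first_byte >> 3) & 0b111  # next 3 bits
mode = first_byte & 0b111            # lowest 3 bits

print(li, version, mode)  # 0 4 4
```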
<p>The rest of the header is a bit more tricky, but with the bitstring syntax it is easy to manage.</p>
<h3 id="processing-the-response-the-rest"><a href="#processing-the-response-the-rest">Processing the response, the rest</a></h3>
<p>Finally, we match all the mandatory fields.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">6&gt;   &lt;&lt; LI:2, Version:3, Mode:3, Stratum:8, Poll:8/signed, Precision:8/signed,
</div><div class="line" data-line="2">6&gt;      RootDel:32, RootDisp:32, R1:8, R2:8, R3:8, R4:8, RtsI:32, RtsF:32,
</div><div class="line" data-line="3">6&gt;      OtsI:32, OtsF:32,   RcvI:32, RcvF:32, XmtI:32, XmtF:32 &gt;&gt; = Resp.
</div><div class="line" data-line="4">&lt;&lt;36,2,0,231,0,0,0,110,0,0,0,25,212,59,3,3,226,84,62,89,
</div><div class="line" data-line="5">  208,192,202,156,0,0,0,0,0,...&gt;&gt;
</div></code></pre>
<p>Making sense of these values requires a bit more legwork. First, we need a utility function for binary fractions.</p>
<p>In Erlang, function arity differentiates functions so we can do the following:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">binfrac(Bin) -&gt;
</div><div class="line" data-line="2">  binfrac(Bin, 2, 0).
</div><div class="line" data-line="3">binfrac(0, _, Frac) -&gt;
</div><div class="line" data-line="4">  Frac;
</div><div class="line" data-line="5">binfrac(Bin, N, Frac) -&gt;
</div><div class="line" data-line="6">  binfrac(Bin bsr 1, N*2, Frac + (Bin band 1)/N).
</div></code></pre>
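<p>To make the recursion easier to follow, here is a line-by-line Python port of binfrac. For comparison the sketch also includes the conventional reading of a 32-bit NTP fraction field (value divided by 2^32, as in RFC 5905), where the most significant bit is worth 1/2; binfrac as written walks from the least significant bit and gives that bit the weight 1/2 instead. The name ntp_fraction is made up for this example.</p>

```python
def binfrac(value):
    """Direct port of the Erlang binfrac/1: walks the bits from the least
    significant end and gives them the weights 1/2, 1/4, 1/8, ..."""
    frac = 0.0
    weight = 2
    while value != 0:
        frac += (value & 1) / weight
        value >>= 1
        weight *= 2
    return frac


def ntp_fraction(value):
    """Conventional reading of a 32-bit fraction field: value / 2**32."""
    return value / 2**32


print(binfrac(0b1))              # 0.5
print(binfrac(0b10))             # 0.25
print(ntp_fraction(0x80000000))  # 0.5
```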
<p>With this function, we can implement the one that processes the response and returns the values we are interested in.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">% 2208988800 is the offset (1900 to Unix epoch)
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">process_ntp_response(Ntp_response) -&gt;
</div><div class="line" data-line="4">  &lt;&lt; LI:2, Version:3, Mode:3, Stratum:8, Poll:8/signed, Precision:8/signed,
</div><div class="line" data-line="5">     RootDel:32, RootDisp:32, R1:8, R2:8, R3:8, R4:8, RtsI:32, RtsF:32,
</div><div class="line" data-line="6">     OtsI:32, OtsF:32,   RcvI:32, RcvF:32, XmtI:32, XmtF:32 &gt;&gt; = Ntp_response,
</div><div class="line" data-line="7">  &lbrace;NowMS, NowS, NowUS&rbrace; = erlang:timestamp(),
</div><div class="line" data-line="8">  NowTimestamp = NowMS * 1.0e6 + NowS + NowUS / 1.0e6,
</div><div class="line" data-line="9">  TransmitTimestamp = XmtI - 2208988800 + binfrac(XmtF),
</div><div class="line" data-line="10">  &lbrace; &lbrace;li, LI&rbrace;, &lbrace;vn, Version&rbrace;, &lbrace;mode, Mode&rbrace;, &lbrace;stratum, Stratum&rbrace;, &lbrace;poll, Poll&rbrace;, &lbrace;precision, Precision&rbrace;,
</div><div class="line" data-line="11">    &lbrace;rootDelay, RootDel&rbrace;, &lbrace;rootDispersion, RootDisp&rbrace;, &lbrace;referenceId, R1, R2, R3, R4&rbrace;,
</div><div class="line" data-line="12">    &lbrace;referenceTimestamp, RtsI - 2208988800 + binfrac(RtsF)&rbrace;,
</div><div class="line" data-line="13">    &lbrace;originateTimestamp, OtsI - 2208988800 + binfrac(OtsF)&rbrace;,
</div><div class="line" data-line="14">    &lbrace;receiveTimestamp,   RcvI - 2208988800 + binfrac(RcvF)&rbrace;,
</div><div class="line" data-line="15">    &lbrace;transmitTimestamp,  TransmitTimestamp&rbrace;,
</div><div class="line" data-line="16">    &lbrace;clientReceiveTimestamp, NowTimestamp&rbrace;,
</div><div class="line" data-line="17">    &lbrace;offset, TransmitTimestamp - NowTimestamp&rbrace; &rbrace;.
</div></code></pre>
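<p>For comparison, the same header layout can be described with a struct format string in Python. This is a sketch under a few assumptions: the function and constant names are invented for the example, the fraction is read as value / 2^32 rather than through binfrac, and the sample packet is synthetic (its field values are borrowed from this post, with a made-up fraction), not a captured response.</p>

```python
import struct

NTP_UNIX_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01

# Network byte order, 48 bytes in total:
# B  LI/version/mode   B  stratum   b  poll (signed)   b  precision (signed)
# I  root delay        I  root dispersion              4s reference id
# 8I four timestamps, each as a (seconds, fraction) pair
HEADER = struct.Struct("!BBbbII4s8I")


def parse_ntp_header(packet):
    """Split the first 48 bytes of an NTP packet into named fields."""
    (flags, stratum, poll, precision, root_delay, root_dispersion, ref_id,
     _rts_i, _rts_f, _ots_i, _ots_f, _rcv_i, _rcv_f,
     xmt_i, xmt_f) = HEADER.unpack(packet[:48])
    return {
        "li": flags >> 6,
        "vn": (flags >> 3) & 0b111,
        "mode": flags & 0b111,
        "stratum": stratum,
        "poll": poll,
        "precision": precision,
        "root_delay": root_delay,
        "root_dispersion": root_dispersion,
        "reference_id": tuple(ref_id),
        "transmit_timestamp": xmt_i - NTP_UNIX_OFFSET + xmt_f / 2**32,
    }


# Synthetic packet for illustration only.
sample = HEADER.pack(36, 2, 3, -24, 9, 140, bytes([85, 158, 25, 75]),
                     0, 0, 0, 0, 0, 0,
                     1588186048 + NTP_UNIX_OFFSET, 0x80000000)

fields = parse_ntp_header(sample)
print(fields["vn"], fields["precision"], fields["transmit_timestamp"])
```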
<p>And with that, we can process the response.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">&lbrace;&lbrace;li,0&rbrace;,
</div><div class="line" data-line="2"> &lbrace;vn,4&rbrace;,
</div><div class="line" data-line="3"> &lbrace;mode,4&rbrace;,
</div><div class="line" data-line="4"> &lbrace;stratum,2&rbrace;,
</div><div class="line" data-line="5"> &lbrace;poll,3&rbrace;,
</div><div class="line" data-line="6"> &lbrace;precision,-24&rbrace;,
</div><div class="line" data-line="7"> &lbrace;rootDelay,9&rbrace;,
</div><div class="line" data-line="8"> &lbrace;rootDispersion,140&rbrace;,
</div><div class="line" data-line="9"> &lbrace;referenceId,85,158,25,75&rbrace;,
</div><div class="line" data-line="10"> &lbrace;referenceTimestamp,1588186010.7517557&rbrace;,
</div><div class="line" data-line="11"> &lbrace;originateTimestamp,-2208988800&rbrace;,
</div><div class="line" data-line="12"> &lbrace;receiveTimestamp,1588186048.3557627&rbrace;,
</div><div class="line" data-line="13"> &lbrace;transmitTimestamp,1588186048.8841336&rbrace;,
</div><div class="line" data-line="14"> &lbrace;clientReceiveTimestamp,1588186606.531&rbrace;,
</div><div class="line" data-line="15"> &lbrace;offset,-557.6468663215637&rbrace;&rbrace;
</div></code></pre>
<p>Please note, this is the first step in the NTP workflow and does not implement the complete NTP protocol. We do not take into consideration a bunch of details.</p>
<p>Next time we might look into how to implement a simple server (like DNS) in Erlang.</p>
<p>Mitchell's original work:</p>
<p><a href="https://github.com/mnp/erlang-ntp">https://github.com/mnp/erlang-ntp</a></p>
<p>Up to date version and Elixir port:</p>
<p><a href="https://gist.github.com/l1x/b0a7f844b283ac08e3125d1ba6e81eeb">https://gist.github.com/l1x/b0a7f844b283ac08e3125d1ba6e81eeb</a></p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 29 Apr 2020 14:31:21 +0200</pubDate>
    </item>
    <item>
      <title>Data-cat</title>
      <link>https://dev.l1x.be/projects/data-cat/</link>
      <guid isPermaLink="true">https://dev.l1x.be/projects/data-cat/</guid>
      <content:encoded><![CDATA[<h1 id="data-cat"><a href="#data-cat">Data-cat</a></h1>
<p>Deploying DataDog for a large-scale infrastructure</p>
<h2 id="definitions"><a href="#definitions">Definitions</a></h2>
<ul>
<li>Geographic Regions</li>
<li>Stages</li>
<li>Applications</li>
</ul>
<h3 id="geographic-regions"><a href="#geographic-regions">Geographic Regions</a></h3>
<p>Matches the definitions of AWS Regions. It can be used for GCP or on-prem datacenters as well.</p>
<h3 id="stages"><a href="#stages">Stages</a></h3>
<p>Different stages of application deployments, usually: dev, qa, prod.</p>
<h3 id="applications"><a href="#applications">Applications</a></h3>
<p>A service that provides a distinct business functionality.</p>
<h2 id="goals"><a href="#goals">Goals</a></h2>
<ul>
<li>having all monitors and dashboards in version control</li>
<li>having all monitors templated</li>
<li>being able to address smaller parts of the infrastructure</li>
</ul>
<h2 id="implementation"><a href="#implementation">Implementation</a></h2>
<p>Four files represent the DataDog configuration for the whole infrastructure.</p>
<ul>
<li>infrastructure.yaml</li>
</ul>
<p>It contains the logical grouping of applications into stages and regions. The relations are always N:M. One region can contain many stages, and each stage can contain many applications.</p>
<ul>
<li>region.yaml</li>
</ul>
<p>Defaults for a certain region (region).</p>
<ul>
<li>stage.yaml</li>
</ul>
<p>Defaults for a certain stage (region, stage).</p>
<ul>
<li>application.yaml</li>
</ul>
<p>Configuration that is specific for a certain application (region, stage, application).</p>
<h3 id="generating-infrastructureyaml"><a href="#generating-infrastructureyaml">Generating infrastructure.yaml</a></h3>
<p>I recently discovered <a href="https://dhall-lang.org">Dhall</a>, which seems like a perfect fit for writing the infrastructure definition and then generating the YAML files.</p>
<p>The type-safe definitions look like the following:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">let keyValue =
</div><div class="line" data-line="2">        λ(k : Type)
</div><div class="line" data-line="3">      → λ(v : Type)
</div><div class="line" data-line="4">      → λ(mapKey : k)
</div><div class="line" data-line="5">      → λ(mapValue : v)
</div><div class="line" data-line="6">      → &lbrace; mapKey = mapKey, mapValue = mapValue &rbrace;
</div><div class="line" data-line="7">
</div><div class="line" data-line="8">let ApplicationConfig : Type = &lbrace; created_at : Text &rbrace;
</div><div class="line" data-line="9">
</div><div class="line" data-line="10">let Application = &lt; etcd | postgresql | hadoop &gt;
</div><div class="line" data-line="11">let Applications = Prelude.Map.Type Application ApplicationConfig
</div><div class="line" data-line="12">let application = keyValue Application ApplicationConfig
</div><div class="line" data-line="13">
</div><div class="line" data-line="14">let Stage = &lt; dev | qa | prod &gt;
</div><div class="line" data-line="15">let Stages = Prelude.Map.Type Stage Applications
</div><div class="line" data-line="16">let stage = keyValue Stage Applications
</div><div class="line" data-line="17">
</div><div class="line" data-line="18">let AwsRegion = &lt; us-east-1 | eu-central-1 | eu-west-1 &gt;
</div><div class="line" data-line="19">let AwsRegions = Prelude.Map.Type AwsRegion Stages
</div><div class="line" data-line="20">let awsRegion = keyValue AwsRegion Stages
</div></code></pre>
<p>After having these definitions we can create the infrastructure:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">in  [ awsRegion AwsRegion.us-east-1
</div><div class="line" data-line="2">        [ stage Stage.dev
</div><div class="line" data-line="3">             [ application Application.hadoop &lbrace; created_at = &quot;2019-11-04T09:00:00Z&quot; &rbrace;
</div><div class="line" data-line="4">             , application Application.etcd &lbrace; created_at = &quot;2019-11-04T09:00:00Z&quot; &rbrace;
</div><div class="line" data-line="5">             ]
</div><div class="line" data-line="6">        , stage Stage.qa
</div><div class="line" data-line="7">             [ application Application.hadoop &lbrace; created_at = &quot;2019-11-04T09:00:00Z&quot; &rbrace;
</div><div class="line" data-line="8">             , application Application.etcd &lbrace; created_at = &quot;2019-11-04T09:00:00Z&quot; &rbrace;
</div><div class="line" data-line="9">             ]
</div><div class="line" data-line="10">        ]
</div><div class="line" data-line="11">
</div><div class="line" data-line="12">    , awsRegion AwsRegion.eu-west-1
</div><div class="line" data-line="13">        [ stage Stage.dev
</div><div class="line" data-line="14">             [ application Application.hadoop &lbrace; created_at = &quot;2019-11-04T09:00:00Z&quot; &rbrace;
</div><div class="line" data-line="15">             , application Application.etcd &lbrace; created_at = &quot;2019-11-04T09:00:00Z&quot; &rbrace;
</div><div class="line" data-line="16">             ]
</div><div class="line" data-line="17">        ]
</div><div class="line" data-line="18">    , awsRegion AwsRegion.eu-central-1
</div><div class="line" data-line="19">        [ stage Stage.dev
</div><div class="line" data-line="20">            [ application Application.hadoop &lbrace; created_at = &quot;2019-11-04T09:00:00Z&quot; &rbrace;
</div><div class="line" data-line="21">            , application Application.etcd &lbrace; created_at = &quot;2019-11-04T09:00:00Z&quot; &rbrace;
</div><div class="line" data-line="22">            ]
</div><div class="line" data-line="23">        ]
</div><div class="line" data-line="24">    ]
</div></code></pre>
<p>Generating the YAML:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">dhall-to-yaml</span> <span style="color: #e6edf3;">--file</span> <span style="color: #e6edf3;">infrastructure.dhall</span> <span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">infrastructure.yaml</span>
</div></code></pre>
<h3 id="generating-the-folder-structure"><a href="#generating-the-folder-structure">Generating the folder structure</a></h3>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">python3</span> <span style="color: #e6edf3;">gen.py</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">region:</span> <span style="color: #e6edf3;">eu-central-1,</span> <span style="color: #e6edf3;">stage:</span> <span style="color: #e6edf3;">dev</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">region:</span> <span style="color: #e6edf3;">eu-central-1,</span> <span style="color: #e6edf3;">stage:</span> <span style="color: #e6edf3;">dev,</span> <span style="color: #e6edf3;">app:</span> <span style="color: #e6edf3;">etcd</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">region:</span> <span style="color: #e6edf3;">eu-central-1,</span> <span style="color: #e6edf3;">stage:</span> <span style="color: #e6edf3;">dev,</span> <span style="color: #e6edf3;">app:</span> <span style="color: #e6edf3;">hadoop</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">region:</span> <span style="color: #e6edf3;">eu-west-1,</span> <span style="color: #e6edf3;">stage:</span> <span style="color: #e6edf3;">dev</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">region:</span> <span style="color: #e6edf3;">eu-west-1,</span> <span style="color: #e6edf3;">stage:</span> <span style="color: #e6edf3;">dev,</span> <span style="color: #e6edf3;">app:</span> <span style="color: #e6edf3;">etcd</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">region:</span> <span style="color: #e6edf3;">eu-west-1,</span> <span style="color: #e6edf3;">stage:</span> <span style="color: #e6edf3;">dev,</span> <span style="color: #e6edf3;">app:</span> <span style="color: #e6edf3;">hadoop</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">region:</span> <span style="color: #e6edf3;">eu-west-1,</span> <span style="color: #e6edf3;">stage:</span> <span style="color: #e6edf3;">prod</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">region:</span> <span style="color: #e6edf3;">eu-west-1,</span> <span style="color: #e6edf3;">stage:</span> <span style="color: #e6edf3;">prod,</span> <span style="color: #e6edf3;">app:</span> <span style="color: #e6edf3;">etcd</span>
</div><div class="line" data-line="10"><span style="color: #d2a8ff;">region:</span> <span style="color: #e6edf3;">eu-west-1,</span> <span style="color: #e6edf3;">stage:</span> <span style="color: #e6edf3;">prod,</span> <span style="color: #e6edf3;">app:</span> <span style="color: #e6edf3;">hadoop</span>
</div></code></pre>
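<p>The source of gen.py is not shown here, so the following is only a hypothetical sketch of the traversal behind output like the above. It assumes the infrastructure has already been loaded into a nested mapping of region to stage to applications; the dictionary contents and the function name are made up for the example, and the sketch only prints the combinations rather than creating folders.</p>

```python
# Hypothetical in-memory form of the infrastructure: region -> stage -> apps.
infrastructure = {
    "eu-central-1": {"dev": ["etcd", "hadoop"]},
    "eu-west-1": {"dev": ["etcd", "hadoop"], "prod": ["etcd", "hadoop"]},
}


def walk(infra):
    """Yield one line per (region, stage) and per (region, stage, app)."""
    for region in sorted(infra):
        for stage in sorted(infra[region]):
            yield f"region: {region}, stage: {stage}"
            for app in sorted(infra[region][stage]):
                yield f"region: {region}, stage: {stage}, app: {app}"


for line in walk(infrastructure):
    print(line)
```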
<h3 id="templates"><a href="#templates">Templates</a></h3>
<p>The templates folder holds the monitor templates.</p>
<p>Example template:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-yaml" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">---</span>
</div><div class="line" data-line="2"><span style="color: #79c0ff;">name</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">High CPU load on application_name:&lbrace;application_name&rbrace; stage:&lbrace;stage&rbrace; &lbrace;&lbrace;&lbrace;&lbrace;host.name&rbrace;&rbrace;&rbrace;&rbrace; / &lbrace;&lbrace;&lbrace;&lbrace;host.ip&rbrace;&rbrace;&rbrace;&rbrace;</span>
</div><div class="line" data-line="3"><span style="color: #79c0ff;">tags</span><span style="color: #e6edf3;">:</span>
</div><div class="line" data-line="4">  <span style="color: #e6edf3;">-</span> <span style="color: #a5d6ff;">application_name:&lbrace;application_name&rbrace;</span>
</div><div class="line" data-line="5">  <span style="color: #e6edf3;">-</span> <span style="color: #a5d6ff;">stage:&lbrace;stage&rbrace;</span>
</div><div class="line" data-line="6">  <span style="color: #e6edf3;">-</span> <span style="color: #a5d6ff;">region:&lbrace;region&rbrace;</span>
</div><div class="line" data-line="7"><span style="color: #79c0ff;">type</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">metric alert</span>
</div><div class="line" data-line="8"><span style="color: #79c0ff;">query</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">avg(last_5m):avg:system.load.norm.5&lbrace;&lbrace;application_name:&lbrace;application_name&rbrace;,stage:&lbrace;stage&rbrace;&rbrace;&rbrace; by &lbrace;&lbrace;host&rbrace;&rbrace; &gt; &lbrace;critical_threshold&rbrace;</span>
</div><div class="line" data-line="9"><span style="color: #79c0ff;">message</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;"><span style="color: #e6edf3;">&gt;-2</span></span>
</div><div class="line" data-line="10"><span style="color: #a5d6ff;">  High CPU load on application_name:&lbrace;application_name&rbrace; stage:&lbrace;stage&rbrace; &lbrace;&lbrace;&lbrace;&lbrace;host.name&rbrace;&rbrace;&rbrace;&rbrace; / &lbrace;&lbrace;&lbrace;&lbrace;host.ip&rbrace;&rbrace;&rbrace;&rbrace; for 5 consecutive minutes on this node.</span>
</div><div class="line" data-line="11"><span style="color: #a5d6ff;">  Url: https://wd-global-prod.datadoghq.com/monitors/&lbrace;monitor_id&rbrace;</span>
</div><div class="line" data-line="12"><span style="color: #a5d6ff;">  &lbrace;slack_notification_channel&rbrace;</span>
</div><div class="line" data-line="13"><span style="color: #79c0ff;">monitor_options</span><span style="color: #e6edf3;">:</span>
</div><div class="line" data-line="14">  <span style="color: #79c0ff;">notify_audit</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">False</span>
</div><div class="line" data-line="15">  <span style="color: #79c0ff;">locked</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">False</span>
</div><div class="line" data-line="16">  <span style="color: #79c0ff;">timeout_h</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">0</span>
</div><div class="line" data-line="17">  <span style="color: #79c0ff;">silenced</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span><span style="color: #e6edf3;">&lbrace;</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="18">  <span style="color: #79c0ff;">include_tags</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">True</span>
</div><div class="line" data-line="19">  <span style="color: #79c0ff;">require_full_window</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">True</span>
</div><div class="line" data-line="20">  <span style="color: #79c0ff;">new_host_delay</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">300</span>
</div><div class="line" data-line="21">  <span style="color: #79c0ff;">notify_no_data</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">False</span>
</div><div class="line" data-line="22">  <span style="color: #79c0ff;">renotify_interval</span><span style="color: #e6edf3;">:</span> <span style="color: #79c0ff;">0</span>
</div><div class="line" data-line="23">  <span style="color: #79c0ff;">escalation_message</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;"><span style="color: #e6edf3;">&gt;-2</span></span>
</div><div class="line" data-line="24"><span style="color: #a5d6ff;">    CPU load is still damn high.</span>
</div><div class="line" data-line="25">  <span style="color: #79c0ff;">thresholds</span><span style="color: #e6edf3;">:</span>
</div><div class="line" data-line="26">    <span style="color: #79c0ff;">critical</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span><span style="color: #a5d6ff;">critical_threshold</span><span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="27">    <span style="color: #79c0ff;">warning</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span><span style="color: #a5d6ff;">warning_threshold</span><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
<p>This template is rendered with Python's str.format and converted to a dict that is used to talk to the DataDog API.</p>
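<p>A minimal sketch of the rendering step, assuming plain str.format is used on the template text. The shortened template and the render_monitor helper are hypothetical, not the tool's actual code. Doubled braces come out as a literal empty mapping, while single-brace names are filled in:</p>

```python
# Hypothetical, shortened template; field names mirror the monitor options above.
TEMPLATE = """\
monitor_options:
  silenced: {{}}
  thresholds:
    critical: {critical_threshold}
    warning: {warning_threshold}
"""


def render_monitor(template: str, **values: object) -> str:
    """Fill single-brace placeholders; doubled braces become literal braces."""
    return template.format(**values)


rendered = render_monitor(TEMPLATE, critical_threshold=90, warning_threshold=75)
print(rendered)
```

<p>The rendered text could then be turned into a dict with yaml.safe_load from PyYAML, which is one of the packages installed in the deployment section.</p>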
<h3 id="defaults-and-specifics"><a href="#defaults-and-specifics">Defaults and specifics</a></h3>
<p>Defaults are stage-wide settings; specifics apply to a single application (in a region &amp; stage).</p>
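<p>One way to picture this, as a sketch only: the keys below are made up, and the rule that application specifics win over stage defaults is my assumption about how such a merge would usually work.</p>

```python
def merge_settings(defaults: dict, specifics: dict) -> dict:
    """Overlay application specifics on stage defaults (specifics win, by assumption)."""
    merged = dict(defaults)   # copy so the stage-wide defaults stay untouched
    merged.update(specifics)  # application-level values replace matching defaults
    return merged


# Hypothetical values for illustration only.
stage_defaults = {"notify_no_data": False, "critical_threshold": 90}
etcd_specifics = {"critical_threshold": 80}

settings = merge_settings(stage_defaults, etcd_specifics)
print(settings)
```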
<h3 id="tags-alignment"><a href="#tags-alignment">Tags alignment</a></h3>
<p>For all of the above to work together nicely, tags must be deployed on every node, ELB, etc., so that we can reference them in monitors and dashboards.</p>
<h2 id="deployment"><a href="#deployment">Deployment</a></h2>
<p>I gave up on Conda and now just use Python's built-in venv.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">/usr/local/opt/python3/bin/python3</span> <span style="color: #e6edf3;">-m</span> <span style="color: #e6edf3;">venv</span> <span style="color: #e6edf3;">venv</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">.</span> <span style="color: #e6edf3;">venv/bin/activate.fish</span> <span style="color: #8b949e;">#or the shell you are using</span>
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">pip</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">--upgrade</span> <span style="color: #e6edf3;">pip</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">pip</span> <span style="color: #e6edf3;">install</span> <span style="color: #e6edf3;">--upgrade</span> <span style="color: #e6edf3;">toml</span> <span style="color: #e6edf3;">pyyaml</span>
</div></code></pre>
<h3 id="deploying-monitors"><a href="#deploying-monitors">Deploying monitors</a></h3>
<p>Deploying a whole stage:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">./data-cat/data-cat.py</span> <span style="color: #e6edf3;">deploy-monitors</span> <span style="color: #e6edf3;">-r</span> <span style="color: #e6edf3;">eu-west-1</span> <span style="color: #e6edf3;">-s</span> <span style="color: #e6edf3;">qa</span>
</div></code></pre>
<p>Deploying a single application:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">./data-cat/data-cat.py</span> <span style="color: #e6edf3;">deploy-monitors</span> <span style="color: #e6edf3;">-r</span> <span style="color: #e6edf3;">eu-west-1</span> <span style="color: #e6edf3;">-s</span> <span style="color: #e6edf3;">qa</span> <span style="color: #e6edf3;">-a</span> <span style="color: #e6edf3;">etcd</span>
</div></code></pre>
<h3 id="deploying-dashboards"><a href="#deploying-dashboards">Deploying dashboards</a></h3>
<p>Deploying a whole stage:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">./data-cat/data-cat.py</span> <span style="color: #e6edf3;">deploy-dashboards</span> <span style="color: #e6edf3;">-r</span> <span style="color: #e6edf3;">eu-west-1</span> <span style="color: #e6edf3;">-s</span> <span style="color: #e6edf3;">qa</span>
</div></code></pre>
<p>Deploying a single application:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">./data-cat/data-cat.py</span> <span style="color: #e6edf3;">deploy-dashboards</span> <span style="color: #e6edf3;">-r</span> <span style="color: #e6edf3;">eu-west-1</span> <span style="color: #e6edf3;">-s</span> <span style="color: #e6edf3;">qa</span> <span style="color: #e6edf3;">-a</span> <span style="color: #e6edf3;">etcd</span>
</div></code></pre>
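<p>Both commands take the same flags: a region, a stage, and an optional application. A hypothetical argparse layout that accepts these invocations could look like the sketch below. The long option names and the build_parser helper are assumptions for illustration, not data-cat's real code.</p>

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per deployment target."""
    parser = argparse.ArgumentParser(prog="data-cat.py")
    subcommands = parser.add_subparsers(dest="command", required=True)
    for name in ("deploy-monitors", "deploy-dashboards"):
        sub = subcommands.add_parser(name)
        sub.add_argument("-r", "--region", required=True)
        sub.add_argument("-s", "--stage", required=True)
        # Leaving out -a means "the whole stage" in this sketch.
        sub.add_argument("-a", "--application", default=None)
    return parser


args = build_parser().parse_args(
    ["deploy-monitors", "-r", "eu-west-1", "-s", "qa", "-a", "etcd"]
)
print(args.command, args.region, args.stage, args.application)
```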
<h2 id="what-to-monitor"><a href="#what-to-monitor">What to monitor</a></h2>
<p>Following <a href="http://www.brendangregg.com/usemethod.html">Brendan Gregg's USE method</a>, these are the suggested resources to monitor:</p>
<ul>
<li>CPUs: sockets, cores, hardware threads (virtual CPUs)</li>
<li>Memory: capacity</li>
<li>Network interfaces</li>
<li>Storage devices: I/O, capacity</li>
<li>Controllers: storage, network cards</li>
<li>Interconnects: CPUs, memory, I/O</li>
</ul>
<p>How to monitor it (examples):</p>
<ul>
<li>utilization: as a percent over a time interval. eg, &quot;one disk is running at 90% utilization&quot;</li>
<li>saturation: as a queue length. eg, &quot;the CPUs have an average run queue length of four&quot;</li>
<li>errors: scalar counts. eg, &quot;this network interface has had fifty late collisions&quot;</li>
</ul>
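<p>The three metric types above can be captured per resource in a small structure. This is a sketch with hypothetical field names; the numbers simply reuse the figures from the examples (90% utilization, a queue length of four, fifty errors).</p>

```python
from dataclasses import dataclass


@dataclass
class UseSample:
    """One observation of a resource over a time interval (hypothetical structure)."""

    busy_seconds: float      # time the resource spent doing work
    interval_seconds: float  # length of the observation window
    queue_length: float      # average number of waiting requests (saturation)
    error_count: int         # errors seen during the interval

    @property
    def utilization_percent(self) -> float:
        # Utilization: share of the interval the resource was busy, as a percent.
        return 100.0 * self.busy_seconds / self.interval_seconds


disk = UseSample(busy_seconds=54.0, interval_seconds=60.0, queue_length=4.0, error_count=50)
print(
    f"utilization={disk.utilization_percent:.0f}% "
    f"saturation={disk.queue_length} errors={disk.error_count}"
)
```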
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Fri, 01 Nov 2019 14:31:21 +0200</pubDate>
    </item>
    <item>
      <title>Small Alpine Linux containers with Java 13</title>
      <link>https://dev.l1x.be/posts/2019/04/24/small-alpine-linux-containers-with-java-13/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2019/04/24/small-alpine-linux-containers-with-java-13/</guid>
      <content:encoded><![CDATA[<h2 id="intro"><a href="#intro">Intro</a></h2>
<p>Quite often I hear developers complain that Java containers are too big, and how much smaller they would be with Go or other languages. With the new project called <a href="https://openjdk.java.net/projects/portola/">Portola</a> it is possible to make very small (~40MB) containers running Java applications. Alpine Linux has become the de facto standard for small containers, but until now creating a Java environment on it was a rather complex process. This is no longer the case. Let's see how we can leverage Project Portola to create these small containers.</p>
<h2 id="creating-containers"><a href="#creating-containers">Creating Containers</a></h2>
<p>First, we create a container image that has the new, small JDK.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">FROM alpine:latest as build
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">ADD https://download.java.net/java/early_access/alpine/16/binaries/openjdk-13-ea+16_linux-x64-musl_bin.tar.gz /opt/jdk/
</div><div class="line" data-line="4">RUN tar -xzvf /opt/jdk/openjdk-13-ea+16_linux-x64-musl_bin.tar.gz -C /opt/jdk/
</div><div class="line" data-line="5">
</div><div class="line" data-line="6">RUN [&quot;/opt/jdk/jdk-13/bin/jlink&quot;, &quot;--compress=2&quot;, \
</div><div class="line" data-line="7">     &quot;--module-path&quot;, &quot;/opt/jdk/jdk-13/jmods/&quot;, \
</div><div class="line" data-line="8">     &quot;--add-modules&quot;, &quot;java.base&quot;, \
</div><div class="line" data-line="9">     &quot;--output&quot;, &quot;/jlinked&quot;]
</div><div class="line" data-line="10">
</div><div class="line" data-line="11">FROM alpine:latest
</div><div class="line" data-line="12">COPY --from=build /jlinked /opt/jdk/
</div><div class="line" data-line="13">CMD [&quot;/opt/jdk/bin/java&quot;, &quot;--version&quot;]
</div></code></pre>
<p>We can start to build the container:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">v</span><span style="color: #d2a8ff;">@alpine-java</span> <span style="color: #e6edf3;">jdk13_v</span><span style="color: #e6edf3;">]</span>$ <span style="color: #e6edf3;">sudo</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">.</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">Sending</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">context</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">Docker</span> <span style="color: #e6edf3;">daemon</span>   <span style="color: #e6edf3;">2.56kB</span>
</div><div class="line" data-line="3"><span style="color: #e6edf3;">Step</span> <span style="color: #e6edf3;">1/8</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">FROM</span> <span style="color: #e6edf3;">alpine:latest</span> <span style="color: #e6edf3;">as</span> <span style="color: #e6edf3;">build</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">latest:</span> <span style="color: #e6edf3;">Pulling</span> <span style="color: #e6edf3;">from</span> <span style="color: #e6edf3;">library/alpine</span>
</div><div class="line" data-line="5"><span style="color: #e6edf3;">bdf0201b3a05:</span> <span style="color: #e6edf3;">Pull</span> <span style="color: #e6edf3;">complete</span>
</div><div class="line" data-line="6"><span style="color: #e6edf3;">Digest:</span> <span style="color: #e6edf3;">sha256:28ef97b8686a0b5399129e9b763d5b7e5ff03576aa5580d6f4182a49c5fe1913</span>
</div><div class="line" data-line="7"><span style="color: #e6edf3;">Status:</span> <span style="color: #e6edf3;">Downloaded</span> <span style="color: #e6edf3;">newer</span> <span style="color: #e6edf3;">image</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">alpine:latest</span>
</div><div class="line" data-line="8"> <span style="color: #e6edf3;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">cdf98d1859c1</span>
</div><div class="line" data-line="9"><span style="color: #e6edf3;">Step</span> <span style="color: #e6edf3;">2/8</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">ADD</span> <span style="color: #e6edf3;">https://download.java.net/java/early_access/alpine/16/binaries/openjdk-13-ea+16_linux-x64-musl_bin.tar.gz</span> <span style="color: #e6edf3;">/opt/jdk/</span>
</div><div class="line" data-line="10"><span style="color: #e6edf3;">Downloading</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">==================================================</span><span style="color: #79c0ff;">&gt;</span><span style="color: #e6edf3;">]</span>  <span style="color: #e6edf3;">195.2MB/195.2MB</span>
</div><div class="line" data-line="11"> <span style="color: #e6edf3;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="12"> <span style="color: #e6edf3;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">b1a444e9dde9</span>
</div><div class="line" data-line="13"><span style="color: #e6edf3;">Step</span> <span style="color: #e6edf3;">3/7</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #e6edf3;">tar</span> <span style="color: #e6edf3;">-xzvf</span> <span style="color: #e6edf3;">/opt/jdk/openjdk-13-ea+16_linux-x64-musl_bin.tar.gz</span> <span style="color: #e6edf3;">-C</span> <span style="color: #e6edf3;">/opt/jdk/</span>
</div><div class="line" data-line="14"> <span style="color: #e6edf3;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="15"> <span style="color: #e6edf3;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">ce2721c75ea0</span>
</div><div class="line" data-line="16"><span style="color: #e6edf3;">Step</span> <span style="color: #e6edf3;">4/7</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">RUN</span> <span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">&quot;/opt/jdk/jdk-13/bin/jlink&quot;</span><span style="color: #a5d6ff;">,</span> <span style="color: #a5d6ff;">&quot;--compress=2&quot;</span><span style="color: #a5d6ff;">,</span>      <span style="color: #a5d6ff;">&quot;--module-path&quot;</span><span style="color: #a5d6ff;">,</span> <span style="color: #a5d6ff;">&quot;/opt/jdk/jdk-13/jmods/&quot;</span><span style="color: #a5d6ff;">,</span>      <span style="color: #a5d6ff;">&quot;--add-modules&quot;</span><span style="color: #a5d6ff;">,</span> <span style="color: #a5d6ff;">&quot;java.base&quot;</span><span style="color: #a5d6ff;">,</span>      <span style="color: #a5d6ff;">&quot;--output&quot;</span><span style="color: #a5d6ff;">,</span> <span style="color: #a5d6ff;">&quot;/jlinked&quot;</span><span style="color: #e6edf3;">]</span>
</div><div class="line" data-line="17"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="18"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">d7b2793ed509</span>
</div><div class="line" data-line="19"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">5/7</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">FROM</span> <span style="color: #e6edf3;">alpine:latest</span>
</div><div class="line" data-line="20"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">cdf98d1859c1</span>
</div><div class="line" data-line="21"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">6/7</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">COPY</span> <span style="color: #e6edf3;">--from=build</span> <span style="color: #e6edf3;">/jlinked</span> <span style="color: #e6edf3;">/opt/jdk/</span>
</div><div class="line" data-line="22"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Using</span> <span style="color: #e6edf3;">cache</span>
</div><div class="line" data-line="23"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">993fb106f2c2</span>
</div><div class="line" data-line="24"><span style="color: #d2a8ff;">Step</span> <span style="color: #e6edf3;">7/7</span> <span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">CMD</span> <span style="color: #e6edf3;">[</span><span style="color: #a5d6ff;">&quot;/opt/jdk/bin/java&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;--version&quot;</span><span style="color: #e6edf3;">]</span> <span style="color: #e6edf3;">-</span> <span style="color: #e6edf3;">to</span> <span style="color: #e6edf3;">check</span> <span style="color: #e6edf3;">JDK</span> <span style="color: #e6edf3;">version</span>
</div><div class="line" data-line="25"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">Running</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">8e1658f5f84d</span>
</div><div class="line" data-line="26"><span style="color: #d2a8ff;">Removing</span> <span style="color: #e6edf3;">intermediate</span> <span style="color: #e6edf3;">container</span> <span style="color: #e6edf3;">8e1658f5f84d</span>
</div><div class="line" data-line="27"> <span style="color: #d2a8ff;">---</span><span style="color: #79c0ff;">&gt;</span> <span style="color: #e6edf3;">350dd3a72a7d</span>
</div><div class="line" data-line="28"><span style="color: #d2a8ff;">Successfully</span> <span style="color: #e6edf3;">built</span> <span style="color: #e6edf3;">350dd3a72a7d</span>
</div></code></pre>
<p>Even though the JDK download is 195MB and the intermediate build image is 565MB, the final image is only 41.7MB. We can now tag the image.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">v</span><span style="color: #d2a8ff;">@alpine-java</span> <span style="color: #e6edf3;">jdk13_v</span><span style="color: #e6edf3;">]</span>$ <span style="color: #e6edf3;">sudo</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">images</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">REPOSITORY</span>          <span style="color: #e6edf3;">TAG</span>                 <span style="color: #e6edf3;">IMAGE</span> <span style="color: #e6edf3;">ID</span>            <span style="color: #e6edf3;">CREATED</span>             <span style="color: #e6edf3;">SIZE</span>
</div><div class="line" data-line="3"><span style="color: #79c0ff;">&lt;</span><span style="color: #e6edf3;">none</span><span style="color: #79c0ff;">&gt;</span>              <span style="color: #79c0ff;">&lt;</span><span style="color: #e6edf3;">none</span><span style="color: #79c0ff;">&gt;</span>              <span style="color: #e6edf3;">350dd3a72a7d</span>        <span style="color: #79c0ff;">21</span> <span style="color: #e6edf3;">seconds</span> <span style="color: #e6edf3;">ago</span>      <span style="color: #e6edf3;">41.7MB</span>
</div><div class="line" data-line="4"><span style="color: #79c0ff;">&lt;</span><span style="color: #e6edf3;">none</span><span style="color: #79c0ff;">&gt;</span>              <span style="color: #79c0ff;">&lt;</span><span style="color: #e6edf3;">none</span><span style="color: #79c0ff;">&gt;</span>              <span style="color: #e6edf3;">d7b2793ed509</span>        <span style="color: #79c0ff;">25</span> <span style="color: #e6edf3;">minutes</span> <span style="color: #e6edf3;">ago</span>      <span style="color: #e6edf3;">565MB</span>
</div><div class="line" data-line="5"><span style="color: #e6edf3;">alpine</span>              <span style="color: #e6edf3;">latest</span>              <span style="color: #e6edf3;">cdf98d1859c1</span>        <span style="color: #79c0ff;">2</span> <span style="color: #e6edf3;">weeks</span> <span style="color: #e6edf3;">ago</span>         <span style="color: #e6edf3;">5.53MB</span>
</div><div class="line" data-line="6"><span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">v@alpine-java</span> <span style="color: #e6edf3;">jdk13_v</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;"></span><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">sudo</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">tag</span> <span style="color: #e6edf3;">350dd3a72a7d</span> <span style="color: #e6edf3;">jdk-13-musl/jdk-version:v1</span>
</div><div class="line" data-line="7"><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">v</span><span style="color: #d2a8ff;">@alpine-java</span> <span style="color: #e6edf3;">jdk13_v</span><span style="color: #e6edf3;">]</span>$ <span style="color: #e6edf3;">sudo</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">images</span>
</div><div class="line" data-line="8"><span style="color: #e6edf3;">REPOSITORY</span>                <span style="color: #e6edf3;">TAG</span>                 <span style="color: #e6edf3;">IMAGE</span> <span style="color: #e6edf3;">ID</span>            <span style="color: #e6edf3;">CREATED</span>              <span style="color: #e6edf3;">SIZE</span>
</div><div class="line" data-line="9"><span style="color: #e6edf3;">jdk-13-musl/jdk-version</span>   <span style="color: #e6edf3;">v1</span>                  <span style="color: #e6edf3;">350dd3a72a7d</span>        <span style="color: #e6edf3;">About</span> <span style="color: #e6edf3;">a</span> <span style="color: #e6edf3;">minute</span> <span style="color: #e6edf3;">ago</span>   <span style="color: #e6edf3;">41.7MB</span>
</div><div class="line" data-line="10"><span style="color: #79c0ff;">&lt;</span><span style="color: #e6edf3;">none</span><span style="color: #79c0ff;">&gt;</span>                    <span style="color: #79c0ff;">&lt;</span><span style="color: #e6edf3;">none</span><span style="color: #79c0ff;">&gt;</span>              <span style="color: #e6edf3;">d7b2793ed509</span>        <span style="color: #79c0ff;">27</span> <span style="color: #e6edf3;">minutes</span> <span style="color: #e6edf3;">ago</span>       <span style="color: #e6edf3;">565MB</span>
</div><div class="line" data-line="11"><span style="color: #e6edf3;">alpine</span>                    <span style="color: #e6edf3;">latest</span>              <span style="color: #e6edf3;">cdf98d1859c1</span>        <span style="color: #79c0ff;">2</span> <span style="color: #e6edf3;">weeks</span> <span style="color: #e6edf3;">ago</span>          <span style="color: #e6edf3;">5.53MB</span><span style="color: #e6edf3;"></span>
</div></code></pre>
<p>Running the container:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">v</span><span style="color: #d2a8ff;">@alpine-java</span> <span style="color: #e6edf3;">jdk13_v</span><span style="color: #e6edf3;">]</span>$ <span style="color: #e6edf3;">sudo</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">run</span> <span style="color: #e6edf3;">jdk-13-musl/jdk-version:v1</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">openjdk</span> <span style="color: #e6edf3;">13-ea</span> <span style="color: #e6edf3;">2019-09-17</span>
</div><div class="line" data-line="3"><span style="color: #e6edf3;">OpenJDK</span> <span style="color: #e6edf3;">Runtime</span> <span style="color: #e6edf3;">Environment</span> <span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">13-ea+16</span><span style="color: #e6edf3;">)</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">OpenJDK</span> <span style="color: #e6edf3;">64-Bit</span> <span style="color: #e6edf3;">Server</span> <span style="color: #e6edf3;">VM</span> <span style="color: #e6edf3;">(</span><span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">13-ea+16,</span> <span style="color: #e6edf3;">mixed</span> <span style="color: #e6edf3;">mode</span><span style="color: #e6edf3;">)</span>
</div></code></pre>
<h2 id="building-a-helloworld-application"><a href="#building-a-helloworld-application">Building a HelloWorld application</a></h2>
<p>Now we have a base image that we can use to build one containing a Java app. Let's use a simple HelloWorld.java.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">public class HelloWorld &lbrace;
</div><div class="line" data-line="2">  public static void main(String[] args) &lbrace;
</div><div class="line" data-line="3">    System.out.println(&quot;Hello, World&quot;);
</div><div class="line" data-line="4">  &rbrace;
</div><div class="line" data-line="5">&rbrace;
</div></code></pre>
<p>Compile the Java code:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">javac</span> <span style="color: #e6edf3;">HelloWorld.java</span>
</div></code></pre>
<p>We need another Dockerfile for the app container:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">FROM jdk-13-musl/jdk-version:v1
</div><div class="line" data-line="2">ADD HelloWorld.class /
</div><div class="line" data-line="3">CMD [&quot;/opt/jdk/bin/java&quot;, &quot;HelloWorld&quot;]
</div></code></pre>
<p>Building the container:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">build</span> <span style="color: #e6edf3;">.</span>
</div></code></pre>
<p>After tagging we can run HelloWorld:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">sudo</span> <span style="color: #e6edf3;">docker</span> <span style="color: #e6edf3;">run</span> <span style="color: #e6edf3;">jdk-13-musl/hello-world:v1</span>
</div><div class="line" data-line="2"><span style="color: #d2a8ff;">Hello,</span> <span style="color: #e6edf3;">World</span>
</div></code></pre>
<p>The entire docker run takes around 600ms. Not bad for Java.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 24 Apr 2019 10:53:00 +0100</pubDate>
    </item>
    <item>
      <title>Creating partitions automatically in PostgreSQL</title>
      <link>https://dev.l1x.be/posts/2016/02/16/creating-partitions-automatically-in-postgresql/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2016/02/16/creating-partitions-automatically-in-postgresql/</guid>
      <content:encoded><![CDATA[<p><a id="intro"></a></p>
<h2 id="intro"><a href="#intro"><a href="#intro">Intro</a></a></h2>
<p>There are several use cases for splitting tables into smaller chunks in a relational database. Our SQL database of choice is PostgreSQL, the most advanced free and open source database out there for regular SQL workloads. It has decent support for partitioning data in tables, but partitions are not created automatically. While I was working with a client, partitioning came up as a potential optimization to reduce the time it takes to run a query against a smaller portion of the data.</p>
<p><a id="use-cases"></a></p>
<h2 id="what-use-cases-benefit-from-partitioning"><a href="#what-use-cases-benefit-from-partitioning"><a href="#use-cases">What use cases benefit from partitioning?</a></a></h2>
<p>The Postgres website has great coverage of the benefits of partitioning.
Partitioning refers to splitting what is logically one large table into smaller physical pieces. Partitioning can provide several benefits:</p>
<ul>
<li>Query performance can be improved dramatically in certain situations, particularly when most of the heavily accessed rows of the table are in a single partition or a small number of partitions. The partitioning substitutes for leading columns of indexes, reducing index size and making it more likely that the heavily-used parts of the indexes fit in memory.</li>
<li>When queries or updates access a large percentage of a single partition, performance can be improved by taking advantage of sequential scan of that partition instead of using an index and random access reads scattered across the whole table.</li>
<li>Bulk loads and deletes can be accomplished by adding or removing partitions, if that requirement is planned into the partitioning design. ALTER TABLE NO INHERIT and DROP TABLE are both far faster than a bulk operation. These commands also entirely avoid the VACUUM overhead caused by a bulk DELETE.</li>
<li>Seldom-used data can be migrated to cheaper and slower storage media.</li>
</ul>
<p><a id="date-based-partitioning"></a></p>
<h2 id="implementing-daily-partitions-based-on-dates"><a href="#implementing-daily-partitions-based-on-dates"><a href="#date-based-partitioning">Implementing daily partitions based on dates</a></a></h2>
<p>First we are going to create a table with only two fields. In production there are obviously more fields, but for the sake of simplicity I have trimmed the rest.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">CREATE TABLE testing_partition(patent_id BIGINT, date DATE) WITH ( OIDS=FALSE);
</div></code></pre>
<p>There is only one thing to note here: OIDS=FALSE tells Postgres not to assign any OIDs (object identifiers) to the rows in the newly created table. This is the default behaviour of Postgres after the 8.0 release. More about it here: link.
After creating the table we need to create a function that will be used as a trigger to create a partition if it does not exist when inserting into the table. Postgres functions are fun; you should check out what other useful things can be done with them.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">CREATE OR REPLACE FUNCTION create_partition_and_insert() RETURNS trigger AS
</div><div class="line" data-line="2">  $BODY$
</div><div class="line" data-line="3">    DECLARE
</div><div class="line" data-line="4">      partition_date TEXT;
</div><div class="line" data-line="5">      partition TEXT;
</div><div class="line" data-line="6">    BEGIN
</div><div class="line" data-line="7">      partition_date := to_char(NEW.date,&#39;YYYY_MM_DD&#39;);
</div><div class="line" data-line="8">      partition := TG_RELNAME || &#39;_&#39; || partition_date;
</div><div class="line" data-line="9">      IF NOT EXISTS(SELECT relname FROM pg_class WHERE relname=partition) THEN
</div><div class="line" data-line="10">        RAISE NOTICE &#39;A partition has been created %&#39;,partition;
</div><div class="line" data-line="11">        EXECUTE &#39;CREATE TABLE &#39; || partition || &#39; (check (date = &#39;&#39;&#39; || NEW.date || &#39;&#39;&#39;)) INHERITS (&#39; || TG_RELNAME || &#39;);&#39;;
</div><div class="line" data-line="12">      END IF;
</div><div class="line" data-line="13">      EXECUTE &#39;INSERT INTO &#39; || partition || &#39; SELECT(&#39; || TG_RELNAME || &#39; &#39; || quote_literal(NEW) || &#39;).*;&#39;;
</div><div class="line" data-line="14">      RETURN NULL;
</div><div class="line" data-line="15">    END;
</div><div class="line" data-line="16">  $BODY$
</div><div class="line" data-line="17">LANGUAGE plpgsql VOLATILE
</div><div class="line" data-line="18">COST 100;
</div></code></pre>
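<p>To make the naming scheme in the function above concrete, here is the same rule mirrored in Python: parent table name, an underscore, then the date formatted as YYYY_MM_DD. The partition_name helper is hypothetical and only illustrates the string the trigger builds.</p>

```python
from datetime import date


def partition_name(parent_table: str, day: date) -> str:
    """Mirror TG_RELNAME || '_' || to_char(NEW.date, 'YYYY_MM_DD')."""
    return f"{parent_table}_{day.strftime('%Y_%m_%d')}"


name = partition_name("testing_partition", date(2011, 1, 11))
print(name)
```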
<p>One thing to note is that this relies on the “date” field being present in the table, since it controls the name of the partition. The “date” field is of type date (surprise), and we need to convert it to text so it can be used as part of a table name in Postgres. Luckily the to_char function does exactly that: we can give it a mask describing how we would like to receive the string. I chose YYYY_MM_DD as the mask, which gives us nice table names. There is only one more thing left before we can try to insert into our new system: we need to create a trigger that runs before the actual insert happens. Creating the trigger is simple; the only important thing to note is that it has to be a BEFORE INSERT trigger.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">CREATE TRIGGER testing_partition_insert_trigger
</div><div class="line" data-line="2">BEFORE INSERT ON testing_partition
</div><div class="line" data-line="3">FOR EACH ROW EXECUTE PROCEDURE create_partition_and_insert();
</div></code></pre>
<p><a id="testing-partitioning"></a></p>
<h2 id="testing-partitioning"><a href="#testing-partitioning">Testing partitioning</a></h2>
<p>Now we have everything in place for testing partitioning. Let’s execute a few INSERT statements to see that it works as expected.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">=&gt; insert into testing_partition values (12312, &#39;2011-01-11&#39;);
</div><div class="line" data-line="2">NOTICE:  A partition has been created testing_partition_2011_01_11
</div><div class="line" data-line="3">INSERT 0 0
</div><div class="line" data-line="4">=&gt;
</div><div class="line" data-line="5">=&gt; insert into testing_partition values (1, &#39;2011-01-11&#39;);
</div><div class="line" data-line="6">INSERT 0 0
</div></code></pre>
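<p>One minor problem you might notice is that Postgres reports INSERT 0 0, so we cannot see how many rows were inserted. The trigger function returns NULL, so no row is stored in the parent table and the count stays at zero, even though the row was written to the partition. Other than that, everything seems to be working.</p>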
<p><a id="checking-partitions"></a></p>
<h2 id="checking-partitions"><a href="#checking-partitions">Checking Partitions</a></h2>
<p>We have a few partitions in our setup, but there is no convenient way to check exactly how many there are. To check on our partitions, we can craft a simple query and roll it into a view for easier execution.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">CREATE VIEW show_partitions AS
</div><div class="line" data-line="2">SELECT nmsp_parent.nspname AS parent_schema,
</div><div class="line" data-line="3">       parent.relname AS parent,
</div><div class="line" data-line="4">       nmsp_child.nspname AS child_schema,
</div><div class="line" data-line="5">       child.relname AS child
</div><div class="line" data-line="6">FROM pg_inherits
</div><div class="line" data-line="7">JOIN pg_class parent ON pg_inherits.inhparent = parent.oid
</div><div class="line" data-line="8">JOIN pg_class child ON pg_inherits.inhrelid = child.oid
</div><div class="line" data-line="9">JOIN pg_namespace nmsp_parent ON nmsp_parent.oid = parent.relnamespace
</div><div class="line" data-line="10">JOIN pg_namespace nmsp_child ON nmsp_child.oid = child.relnamespace
</div><div class="line" data-line="11">WHERE parent.relname=&#39;testing_partition&#39; ;
</div></code></pre>
<p>Let’s select all of the partitions we have for the table so far:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">=&gt; select * from show_partitions;
</div><div class="line" data-line="2"> parent_schema |      parent       | child_schema |            child
</div><div class="line" data-line="3">---------------+-------------------+--------------+------------------------------
</div><div class="line" data-line="4"> public        | testing_partition | public       | testing_partition_2019_01_11
</div><div class="line" data-line="5"> public        | testing_partition | public       | testing_partition_2018_01_11
</div><div class="line" data-line="6"> public        | testing_partition | public       | testing_partition_2011_01_11
</div><div class="line" data-line="7">(3 rows)
</div></code></pre>
<p>Perfect, now we have a good start for using our new setup with automatic partition creation. A few open questions are left on the table:</p>
<ul>
<li><s>how to return the correct number inserted to the table</s></li>
<li><s>how to return the newly created id from the function</s></li>
</ul>
<p>Update I: changing the INSERT statement so that it returns patent_id:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">CREATE OR REPLACE FUNCTION create_partition_and_insert() RETURNS trigger AS
</div><div class="line" data-line="2">  $BODY$
</div><div class="line" data-line="3">    DECLARE
</div><div class="line" data-line="4">      partition_date TEXT;
</div><div class="line" data-line="5">      partition TEXT;
</div><div class="line" data-line="6">    BEGIN
</div><div class="line" data-line="7">      partition_date := to_char(NEW.date,&#39;YYYY_MM_DD&#39;);
</div><div class="line" data-line="8">      partition := TG_RELNAME || &#39;_&#39; || partition_date;
</div><div class="line" data-line="9">      IF NOT EXISTS(SELECT relname FROM pg_class WHERE relname=partition) THEN
</div><div class="line" data-line="10">        RAISE NOTICE &#39;A partition has been created %&#39;,partition;
</div><div class="line" data-line="11">        EXECUTE &#39;CREATE TABLE &#39; || partition || &#39; (check (date = &#39;&#39;&#39; || NEW.date || &#39;&#39;&#39;)) INHERITS (&#39; || TG_RELNAME || &#39;);&#39;;
</div><div class="line" data-line="12">      END IF;
</div><div class="line" data-line="13">      EXECUTE &#39;INSERT INTO &#39; || partition || &#39; SELECT(&#39; || TG_RELNAME || &#39; &#39; || quote_literal(NEW) || &#39;).* RETURNING patent_id;&#39;;
</div><div class="line" data-line="14">      RETURN NULL;
</div><div class="line" data-line="15">    END;
</div><div class="line" data-line="16">  $BODY$
</div><div class="line" data-line="17">LANGUAGE plpgsql VOLATILE
</div><div class="line" data-line="18">COST 100;
</div></code></pre>
<p>We can add the same to the actual insert we are issuing.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1"># insert into testing_partition values (1, &#39;2011-01-11&#39;) returning patent_id ;
</div><div class="line" data-line="2"> patent_id
</div><div class="line" data-line="3">-----------
</div><div class="line" data-line="4">         1
</div><div class="line" data-line="5">(1 row)
</div></code></pre>
<p>I am going to update this post when I figure out these things. Thanks for reading!</p>
<p>UPDATE I: I finally figured out how to return the ids; see the updates above.</p>
<p>I have received some negative feedback for writing this. I think there is an official Postgres solution for this problem.</p>
<p><a href="https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL_Partitions.html">https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL_Partitions.html</a></p>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Tue, 16 Feb 2016 14:32:04 +0100</pubDate>
    </item>
    <item>
      <title>Converting Amazon S3 logs to Avro</title>
      <link>https://dev.l1x.be/posts/2016/02/03/converting-amazon-s3-logs-to-avro/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2016/02/03/converting-amazon-s3-logs-to-avro/</guid>
      <content:encoded><![CDATA[<h2 id="amazon-s3-website-hosting"><a href="#amazon-s3-website-hosting">Amazon S3 Website Hosting</a></h2>
<p>Amazon S3 is an excellent resource for hosting static websites (html, css, js) because it comes with free SSL certificates and a fast content delivery network at a reasonable price. Hosting websites is trivial and well documented. After setting all this up, we have a running website using SSL with geographically distributed edge caches for faster page loads.</p>
<p>S3 provides access logging for tracking requests to your bucket. Each access log entry (called a record) has information about a single request, including requester, request time, response status, bucket, key, etc. The actual format is described in the S3 documentation, which explains each field in depth.</p>
<p>Example entry:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be mybucket [06/Feb/2014:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 3E57427F3EXAMPLE REST.GET.VERSIONING - &quot;GET /mybucket?versioning HTTP/1.1&quot; 200 - 113 - 7 - &quot;-&quot; &quot;S3Console/0.4&quot; -
</div></code></pre>
<p>The problem with such a log format is that we can’t access individual fields easily (without a regexp) and that the information is stored as human-friendly text. This is not optimal for storing and querying larger datasets. We need to transform it into a more space-efficient format that reduces the IO when reading a large chunk of the data from disk or when using distributed analytical platforms like Hadoop.</p>
<h2 id="why-apache-avro"><a href="#why-apache-avro">Why Apache Avro</a></h2>
<p>Apache Avro has a long track record of being used in production and it can be queried on Hadoop with ease.</p>
<p>According to the documentation, Avro provides:</p>
<ul>
<li>Rich data structures.</li>
<li>A compact, fast, binary data format.</li>
<li>A container file, to store persistent data.</li>
<li>and a few other things we don’t need right now</li>
</ul>
<p>Avro also uses schemas, so we can trust our data while processing it. The other alternative would be Apache ORC, which is even more suitable for analytical use. I am going with Avro this time because it is better supported than ORC in Clojure at the moment.</p>
<h2 id="why-clojure"><a href="#why-clojure">Why Clojure</a></h2>
<p>My personal reasons for using Clojure for data projects like this are:</p>
<ul>
<li>quick prototyping (REPL)</li>
<li>support for asynchronous programming</li>
<li>small code base, less verbose than Java yet more readable</li>
<li>access to all of the Java libraries</li>
</ul>
<p>Most of the data services I work with on a daily basis have decent Java support, which means I can just as easily use those libraries in Clojure. I also like small nice things. :)</p>
<h2 id="getting-started"><a href="#getting-started">Getting Started</a></h2>
<p>Just to summarise what we are trying to achieve with this project and article series:</p>
<ul>
<li>reading text files from Amazon S3 and converting the data to Avro (part I)</li>
<li>converting single-threaded execution to asynchronous execution with core.async (part II)</li>
<li>building a simple DSL to query Avro files (part III)</li>
</ul>
<p>To start, I am going through the major topics involved in the process: how to use the AWS S3 API, how to create Avro files and finally how to process the lines of the log files.</p>
<h2 id="talking-to-s3"><a href="#talking-to-s3">Talking to S3</a></h2>
<p>After some initial poking around with the libraries we need for this, I decided to use the raw Java S3 API. It is so well written that using it in Clojure is a breeze.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn create-basic-aws-credentials
</div><div class="line" data-line="2"> &quot;Takes a hashmap with AWS security credentials and creates a BasicAWSCredentials&quot;
</div><div class="line" data-line="3"> ^BasicAWSCredentials [^PersistentArrayMap credentials]
</div><div class="line" data-line="4"> ;guard function with both keys checked if present
</div><div class="line" data-line="5"> (BasicAWSCredentials.
</div><div class="line" data-line="6"> (:aws_access_key_id credentials)
</div><div class="line" data-line="7"> (:aws_secret_access_key credentials)))
</div><div class="line" data-line="8">
</div><div class="line" data-line="9">(defn connect-with-basic-credentials
</div><div class="line" data-line="10"> &quot;Connecting to S3 only with credentials&quot;
</div><div class="line" data-line="11"> ^AmazonS3Client [^BasicAWSCredentials basic-aws-credentials]
</div><div class="line" data-line="12"> (AmazonS3Client. basic-aws-credentials))
</div></code></pre>
<p>Creating a credential and using it to create an AmazonS3Client is simple. We can use many S3 clients at the same time for better performance but for the initial version we are going to stick to a single connection.</p>
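<p>The guard mentioned in the comment of create-basic-aws-credentials is not shown in this post. A minimal sketch of such a check could look like the following. The name valid-credentials? is hypothetical; only the two key names come from the code above.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(require &#39;[clojure.string :as string])
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">(defn valid-credentials?
</div><div class="line" data-line="4">  &quot;Hypothetical guard: true when both keys hold non-blank strings.&quot;
</div><div class="line" data-line="5">  [credentials]
</div><div class="line" data-line="6">  (every? (fn [k]
</div><div class="line" data-line="7">            (let [v (get credentials k)]
</div><div class="line" data-line="8">              (and (string? v) (not (string/blank? v)))))
</div><div class="line" data-line="9">          [:aws_access_key_id :aws_secret_access_key]))
</div></code></pre>
<p>Calling it before building the credentials object lets the program stop early with its own error message when a key is missing or empty.</p>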
<p>Log files are organised around dates, so keeping one file per day sounds reasonable. Each day has zero or more entries, where the maximum is fewer than 10,000, so there is no need to split a day into smaller chunks. On average there are 1000–2000 files per day, depending on the number of access entries. We are going to process data day by day, using a moving window. The size of the window and when it starts can be configured in the config.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">:days &lbrace;
</div><div class="line" data-line="2"> :start 2 ; starts the processing x days ago
</div><div class="line" data-line="3"> :stop 12 ; stops the processing y days ago, processing 10 days worth of data this way
</div><div class="line" data-line="4"> &rbrace;
</div></code></pre>
<p>Using the example from the config yields the following list of dates:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(&quot;2016-01-20&quot; &quot;2016-01-21&quot; &quot;2016-01-22&quot; &quot;2016-01-23&quot;
</div><div class="line" data-line="2">  &quot;2016-01-24&quot; &quot;2016-01-25&quot; &quot;2016-01-26&quot; &quot;2016-01-27&quot;
</div><div class="line" data-line="3">  &quot;2016-01-28&quot; &quot;2016-01-29&quot; &quot;2016-01-30&quot; &quot;2016-01-31&quot;)
</div></code></pre>
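<p>The function that turns the window into dates is not shown here. A minimal sketch of how it could work, assuming java.time (Java 8 or newer) is available. The name days-window, the explicit today argument and the boundary handling (stop inclusive, start exclusive) are assumptions, so the result may differ from the list above at the edges.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(import &#39;java.time.LocalDate)
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">(defn days-window
</div><div class="line" data-line="4">  &quot;Hypothetical helper: ISO dates from `stop` days ago (inclusive) up to
</div><div class="line" data-line="5">  `start` days ago (exclusive), oldest first.&quot;
</div><div class="line" data-line="6">  [start stop ^LocalDate today]
</div><div class="line" data-line="7">  (map (fn [n] (str (.minusDays today (long n))))
</div><div class="line" data-line="8">       (range stop start -1)))
</div><div class="line" data-line="9">
</div><div class="line" data-line="10">; ten dates, from 2016-01-20 to 2016-01-29
</div><div class="line" data-line="11">(days-window 2 12 (LocalDate/of 2016 2 1))
</div></code></pre>
<p>Passing today in as an argument keeps the function pure, which makes it easy to try in the REPL with a fixed date.</p>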
<p>Fetching the actual file names for each day can look tricky at first sight, but we can use the truncated flag of the listing to check whether there are more files for the particular day than fit in one response (1000 by default).</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn list-all-files-eager-blocking
</div><div class="line" data-line="2">  &quot;Returns a sequence of the items in a bucket or bucket/folder  &quot;
</div><div class="line" data-line="3">  [ ^AmazonS3Client amazon-s3-client ^String bucket-name ^String prefix
</div><div class="line" data-line="4">    ^String marker ^String delimiter ^Integer max-keys ^PersistentList acc]
</div><div class="line" data-line="5">  (log/debug bucket-name acc)
</div><div class="line" data-line="6">  (let [  ^ListObjectsRequest list-object-request (create-list-object-request
</div><div class="line" data-line="7">                                                    bucket-name
</div><div class="line" data-line="8">                                                    prefix marker
</div><div class="line" data-line="9">                                                    delimiter
</div><div class="line" data-line="10">                                                    max-keys)
</div><div class="line" data-line="11">          ^ObjectListing object-listing  (list-objects amazon-s3-client list-object-request)]
</div><div class="line" data-line="12">    (if-not (is-truncated? object-listing)
</div><div class="line" data-line="13">      ; return
</div><div class="line" data-line="14">      (flatten (concat acc (map get-s3-object-summary-clj
</div><div class="line" data-line="15">                      (get-object-summaries object-listing))))
</div><div class="line" data-line="16">      ; recur with the new request                           ;
</div><div class="line" data-line="17">      (recur  amazon-s3-client
</div><div class="line" data-line="18">              bucket-name
</div><div class="line" data-line="19">              prefix
</div><div class="line" data-line="20">              (get-next-marker object-listing)
</div><div class="line" data-line="21">              delimiter
</div><div class="line" data-line="22">              max-keys
</div><div class="line" data-line="23">              (conj acc (map get-s3-object-summary-clj
</div><div class="line" data-line="24">                          (get-object-summaries object-listing)))))))
</div></code></pre>
<p>This function is blocking, so it won’t return until all of the items are fetched; it is not recommended for processing 100,000+ files at the same time. For processing that many files we need to rewrite it to produce a lazy sequence, where the items are looked up when needed (added to the TODO). The function that returns the Clojure representation (a hash-map) of an S3 object summary is get-s3-object-summary-clj.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn get-s3-object-summary-clj
</div><div class="line" data-line="2">  &quot;Returns a Clojure representation of a S3ObjectSummary&quot;
</div><div class="line" data-line="3">  [^S3ObjectSummary s3-object-summary]
</div><div class="line" data-line="4">  &lbrace; :bucket-name    (.getBucketName     s3-object-summary)
</div><div class="line" data-line="5">    :e-tag          (.getETag           s3-object-summary)
</div><div class="line" data-line="6">    :key            (.getKey            s3-object-summary)
</div><div class="line" data-line="7">    :last-modified  (.getLastModified   s3-object-summary)
</div><div class="line" data-line="8">    :owner          (.getOwner          s3-object-summary)
</div><div class="line" data-line="9">    :size           (.getSize           s3-object-summary)
</div><div class="line" data-line="10">    :storage-class  (.getStorageClass   s3-object-summary) &rbrace;)
</div></code></pre>
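<p>The lazy variant of the listing mentioned earlier is still on the TODO list, so the following is only a sketch of one possible shape. The fetch-page argument is a hypothetical function that takes a marker (nil for the first page) and returns a map with :items and :next-marker; in the real code it would wrap create-list-object-request and list-objects.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn lazy-pages
</div><div class="line" data-line="2">  &quot;Hypothetical helper: walks paged results lazily, page by page.&quot;
</div><div class="line" data-line="3">  [fetch-page marker]
</div><div class="line" data-line="4">  (lazy-seq
</div><div class="line" data-line="5">    (let [page        (fetch-page marker)
</div><div class="line" data-line="6">          items       (:items page)
</div><div class="line" data-line="7">          next-marker (:next-marker page)]
</div><div class="line" data-line="8">      (if next-marker
</div><div class="line" data-line="9">        ; more pages: append a lazy tail that fetches the next page on demand
</div><div class="line" data-line="10">        (concat items (lazy-pages fetch-page next-marker))
</div><div class="line" data-line="11">        ; last page
</div><div class="line" data-line="12">        items))))
</div></code></pre>
<p>With this shape, a consumer that only takes the first few items triggers a request for the first page only; later pages are requested when the consumer reaches them.</p>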
<p>This way we have a list of entries that we are going to process later. To boot all this up in the REPL we can use the following few lines, assuming the configuration is correct and the credential file is present with a valid access key and secret key.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(def config                           (cli/process-config &quot;conf/app.edn&quot;))
</div><div class="line" data-line="2">(def credentials                      (:ok (get-credentials (get-in config [:ok :aws :credentials-file]))))
</div><div class="line" data-line="3">(def bucket                           (name (get-in config [:ok :aws :s3 :bucket])))
</div><div class="line" data-line="4">(def aws-basic-cred                   (s4/create-basic-aws-credentials credentials))
</div><div class="line" data-line="5">(def aws-s3-connection                (s4/connect-with-basic-credentials aws-basic-cred))
</div><div class="line" data-line="6">(def processing-days                  (days 2 12))
</div><div class="line" data-line="7">(def s3-log-pattern                   (re-pattern (get-in config [:ok :aws :log-format])))
</div><div class="line" data-line="8">(def all-files-for-a-day              (s4/list-all-files-eager aws-s3-connection bucket (str &quot;logs/&quot; &quot;2015-12-25-10&quot;) &quot;&quot; &quot;&quot; (int 1000) ()))
</div><div class="line" data-line="9">(def first-entry                      (first all-files-for-a-day))
</div><div class="line" data-line="10">(def object-content                   (s4/get-object-content-safe (s4/get-object aws-s3-connection bucket &quot;logs/2015-12-25-00-23-17-8AC95FEBE0374F7B&quot;)))
</div><div class="line" data-line="11">(def schema-file                      &quot;schema/amazon-log.avsc&quot;)
</div><div class="line" data-line="12">(def s3-log-avro-schema-json          (json/parse-stream (io/reader schema-file)))
</div><div class="line" data-line="13">(def s3-log-avro-schema-fields        (get-avro-schema-fields s3-log-avro-schema-json))
</div><div class="line" data-line="14">(def s3-log-avro-schema-fields-dash   (replace-field-names s3-log-avro-schema-fields &quot;_&quot; &quot;-&quot;))
</div><div class="line" data-line="15">(def s3-log-avro-schema               (avro-schema schema-file))
</div><div class="line" data-line="16">(def int-fields                        #&lbrace;:turn-around-time :http-status :total-time :bytes-sent :object-size&rbrace;)
</div></code></pre>
<p>Once connected to S3, we can play with the log files. Checking the first entry (calling first on all-files-for-a-day):</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">&lbrace;:bucket-name &quot;www.streambrightdata.com&quot;, :e-tag &quot;a4092cf1a282c3fb7d027ee56e4155d4&quot;,
</div><div class="line" data-line="2"> :key &quot;logs/2016-02-01-10-22-42-07BA495229374DB4&quot;, :last-modified #inst &quot;2016-02-01T10:22:43.000-00:00&quot;,
</div><div class="line" data-line="3"> :owner #object[com.amazonaws.services.s3.model.Owner 0x77314d5c &quot;S3Owner [name=s3-log-service,id=3272e1]&quot;],
</div><div class="line" data-line="4"> :size 398, :storage-class &quot;STANDARD&quot;&rbrace;
</div></code></pre>
<p>Since Clojure keywords can be used as functions we can easily list all of the file names in the list we produced earlier.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">;s3-logrotate.core=&gt; (map :key all-files-for-a-day)
</div><div class="line" data-line="2">(&quot;logs/2016-02-01-10-22-42-07BA495229374DB4&quot; &quot;logs/2016-02-01-10-22-50-BD7DAC7BF88CBDC8&quot;
</div><div class="line" data-line="3">&quot;logs/2016-02-01-10-22-55-296AE2E3EBCDD3B6&quot; &quot;logs/2016-02-01-10-22-57-5853C8B48DC9163A&quot;
</div><div class="line" data-line="4">&quot;logs/2016-02-01-10-23-07-EDBAAA82DFF3B039&quot; &quot;logs/2016-02-01-10-23-07-F8325A25289E1015&quot;
</div><div class="line" data-line="5">&quot;logs/2016-02-01-10-23-11-E949D06BEFE68356&quot; &quot;logs/2016-02-01-10-23-14-90FFF938F152B30A&quot;
</div><div class="line" data-line="6">&quot;logs/2016-02-01-10-25-40-465E2F36B71741F9&quot; &quot;logs/2016-02-01-10-31-03-CA1510898F68F9FF&quot;
</div><div class="line" data-line="7">&quot;logs/2016-02-01-10-31-16-6B7F2165642094E5&quot; &quot;logs/2016-02-01-10-31-17-64F85764086EB154&quot;
</div><div class="line" data-line="8">&quot;logs/2016-02-01-10-39-57-E7C3302E2ECBAF51&quot; &quot;logs/2016-02-01-10-40-08-EB8A7B1695CECB67&quot;
</div><div class="line" data-line="9">&quot;logs/2016-02-01-10-40-13-A79F7E8E40B5151B&quot; &quot;logs/2016-02-01-10-40-56-F18AA7085783DF53&quot;
</div><div class="line" data-line="10">&quot;logs/2016-02-01-10-40-57-3B568EBF4995F2A5&quot; &quot;logs/2016-02-01-10-41-14-27A07093139561C9&quot;
</div><div class="line" data-line="11">&quot;logs/2016-02-01-10-41-29-B08373BAFE62149E&quot; &quot;logs/2016-02-01-10-41-44-DC32EBA17CF8604F&quot; )
</div></code></pre>
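<p>Because every key starts with the date, the listing can be bucketed per day before processing. A small sketch, assuming the logs/ prefix and the key shape shown above; key-day and files-by-day are hypothetical names.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn key-day
</div><div class="line" data-line="2">  &quot;Hypothetical helper: the YYYY-MM-DD part of a log key, or nil.&quot;
</div><div class="line" data-line="3">  [s3-key]
</div><div class="line" data-line="4">  (second (re-find #&quot;^logs/(\d&lbrace;4&rbrace;-\d&lbrace;2&rbrace;-\d&lbrace;2&rbrace;)-&quot; s3-key)))
</div><div class="line" data-line="5">
</div><div class="line" data-line="6">(defn files-by-day
</div><div class="line" data-line="7">  &quot;Groups object summaries (maps with a :key) by the day in their key.&quot;
</div><div class="line" data-line="8">  [files]
</div><div class="line" data-line="9">  (group-by (comp key-day :key) files))
</div><div class="line" data-line="10">
</div><div class="line" data-line="11">; returns a map from the day string to the summaries of that day
</div><div class="line" data-line="12">(files-by-day [&lbrace;:key &quot;logs/2016-02-01-10-22-42-07BA495229374DB4&quot;&rbrace;])
</div></code></pre>
<p>Keys that do not match the expected shape end up under the nil key, so they are easy to spot.</p>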
<h2 id="processing-a-single-line"><a href="#processing-a-single-line">Processing a single line</a></h2>
<p>Unfortunately there is no better way of processing these lines than using a regular expression.</p>
<p>I guess it is not nice, but at least it gets the job done. I still need to run it on bigger data sets, but for our use case it works. When there are parenthesized groups in the pattern and re-find finds a match, it returns a vector. The first element is the matching string, the remaining elements are the individual groups. In this case we need to pay attention not only to the fact that Amazon uses “-” for null values, but also to matching all of the possible values of the referer and user agent fields.</p>
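<p>To make the shape of the re-find result concrete, here is a small sketch with a deliberately shortened three-field pattern (not the real log format). The names short-pattern, dash-to-nil and parse-line are hypothetical and used only for illustration.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">; three groups only: a bucket, a numeric field that may be &quot;-&quot;, and a quoted field
</div><div class="line" data-line="2">(def short-pattern #&quot;(\S+) (\d+|-) \&quot;(.*)\&quot;&quot;)
</div><div class="line" data-line="3">
</div><div class="line" data-line="4">(defn dash-to-nil
</div><div class="line" data-line="5">  &quot;Maps the S3 null marker to nil, leaves everything else alone.&quot;
</div><div class="line" data-line="6">  [s]
</div><div class="line" data-line="7">  (when-not (= s &quot;-&quot;) s))
</div><div class="line" data-line="8">
</div><div class="line" data-line="9">(defn parse-line
</div><div class="line" data-line="10">  &quot;Returns a vector of the captured groups, or nil when the line does not match.&quot;
</div><div class="line" data-line="11">  [line]
</div><div class="line" data-line="12">  (when-let [match (re-find short-pattern line)]
</div><div class="line" data-line="13">    ; (first match) is the whole matching string, the rest are the groups
</div><div class="line" data-line="14">    (mapv dash-to-nil (rest match))))
</div><div class="line" data-line="15">
</div><div class="line" data-line="16">(parse-line &quot;mybucket 200 \&quot;S3Console/0.4\&quot;&quot;)
</div><div class="line" data-line="17">(parse-line &quot;mybucket - \&quot;-\&quot;&quot;)
</div></code></pre>
<p>The first call yields [&quot;mybucket&quot; &quot;200&quot; &quot;S3Console/0.4&quot;] and the second [&quot;mybucket&quot; nil nil]. The full pattern for the S3 access log follows the same idea with more groups:</p>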
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">#&quot;(\S+)
</div><div class="line" data-line="2">([a-z0-9][a-z0-9-.]+)
</div><div class="line" data-line="3">\[(.*\+.*)\]
</div><div class="line" data-line="4">(\b(?:\d&lbrace;1,3&rbrace;\.)&lbrace;3&rbrace;\d&lbrace;1,3&rbrace;\b)
</div><div class="line" data-line="5">(\S+)
</div><div class="line" data-line="6">(\S+)
</div><div class="line" data-line="7">(\S+)
</div><div class="line" data-line="8">(\S+)
</div><div class="line" data-line="9">\&quot;(\w+\ \S+ \S+)\&quot;
</div><div class="line" data-line="10">(\d+|\-)
</div><div class="line" data-line="11">(\S+)
</div><div class="line" data-line="12">(\d+|\-)
</div><div class="line" data-line="13">(\d+|\-)
</div><div class="line" data-line="14">(\d+|\-)
</div><div class="line" data-line="15">(\d+|\-)
</div><div class="line" data-line="16">\&quot;(https?\:\/\/.*\/?|\-)\&quot;
</div><div class="line" data-line="17">\&quot;(.*)\&quot;
</div><div class="line" data-line="18">(\S+)&quot;
</div></code></pre>
<p>This works reasonably well; I haven’t found a non-matching log entry yet. Now we can extend the s3api with the ability to get object content, which is required for downloading an object from S3.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn get-object-content-unsafe
</div><div class="line" data-line="2">  &quot;Gets the input stream containing the contents of this object.
</div><div class="line" data-line="3">  This function returns an InputStream; holding onto it results in resource pool
</div><div class="line" data-line="4">  exhaustion&quot;
</div><div class="line" data-line="5">  ^S3ObjectInputStream [^S3Object object]
</div><div class="line" data-line="6">  (.getObjectContent object))
</div><div class="line" data-line="7">
</div><div class="line" data-line="8">(defn close-object
</div><div class="line" data-line="9">  &quot;Closes object&quot;
</div><div class="line" data-line="10">  [^S3Object object]
</div><div class="line" data-line="11">  (.close object))
</div><div class="line" data-line="12">
</div><div class="line" data-line="13">(defn get-object-content-safe
</div><div class="line" data-line="14">  &quot;Reads all lines of the object eagerly, then closes the object&quot;
</div><div class="line" data-line="15">  [^S3Object object]
</div><div class="line" data-line="16">  (let [
</div><div class="line" data-line="17">          ^PersistentVector return  (with-open
</div><div class="line" data-line="18">                                      [rdr (io/reader (get-object-content-unsafe object))]
</div><div class="line" data-line="19">                                       (reduce conj [] (line-seq rdr)))
</div><div class="line" data-line="20">                            _       (close-object object) ]
</div><div class="line" data-line="21">    ; returning
</div><div class="line" data-line="22">    return))
</div></code></pre>
<p>Creating an S3Object is easy, we just need to supply a connection, a bucket and a key.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn get-object
</div><div class="line" data-line="2">  &quot;Gets S3 object&quot;
</div><div class="line" data-line="3">  [^AmazonS3Client amazon-s3-client ^String bucket-name ^String s3-key]
</div><div class="line" data-line="4">  (.getObject amazon-s3-client bucket-name s3-key))
</div></code></pre>
<p>Now that we have the means to talk to S3 and read files from it, we can move on and have a closer look at Avro files and how to write them in Clojure.</p>
<h2 id="working-with-apache-avro-in-clojure"><a href="#working-with-apache-avro-in-clojure">Working with Apache Avro in Clojure</a></h2>
<p>Luckily there is a good library that we can use to work with Avro files in Clojure, so we don’t need to re-invent the hot water this time. Abracad provides serialization and deserialization for Clojure data structures with Avro that can be persisted to disk or used in message passing systems like Kafka for example. We are going to persist the data to disk this time.</p>
<p>Before we can write any Avro entry to disk we need a schema for the data that we are collecting here. There are some challenges in coming up with the right schema, but we can jump through these hoops.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-json" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="2">  <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;record&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="3">  <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;amazon-log&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="4">  <span style="color: #79c0ff;">&quot;namespace&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;com.streambright.avro&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="5">  <span style="color: #79c0ff;">&quot;fields&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span><span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="6">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;bucket_owner&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="7">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;string&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="8">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The canonical user ID of the owner of the source bucket.&quot;</span>
</div><div class="line" data-line="9">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="10">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;bucket&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="11">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;string&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="12">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The name of the bucket that the request was processed against.&quot;</span>
</div><div class="line" data-line="13">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="14">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;time&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="15">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;string&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="16">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The time at which the request was received. The format, using strftime() terminology, is as follows: [%d/%b/%Y:%H:%M:%S %z]&quot;</span>
</div><div class="line" data-line="17">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="18">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;remote_ip&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="19">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;string&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="20">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The apparent Internet address of the requester. Intermediate proxies and firewalls might obscure the actual address of the machine making the request.&quot;</span>
</div><div class="line" data-line="21">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="22">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;requester&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="23">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span><span style="color: #a5d6ff;">&quot;null&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;string&quot;</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="24">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The canonical user ID of the requester, or the string Anonymous  for unauthenticated requests.&quot;</span>
</div><div class="line" data-line="25">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="26">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;request_id&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="27">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;string&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="28">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The request ID is a string generated by Amazon S3 to uniquely identify each request.&quot;</span>
</div><div class="line" data-line="29">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="30">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;operation&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="31">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;string&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="32">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The operation listed here is declared as SOAP.operation, REST.HTTP_method.resource_type, WEBSITE.HTTP_method.resource_type, or BATCH.DELETE.OBJECT.&quot;</span>
</div><div class="line" data-line="33">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="34">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;key&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="35">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span><span style="color: #a5d6ff;">&quot;null&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;string&quot;</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="36">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The key part of the request, URL encoded, or  -  if the operation does not take a key parameter.&quot;</span>
</div><div class="line" data-line="37">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="38">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;request_uri&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="39">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;string&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="40">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The Request-URI part of the HTTP request message.&quot;</span>
</div><div class="line" data-line="41">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="42">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;http_status&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="43">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;int&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="44">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The numeric HTTP status code of the response.&quot;</span>
</div><div class="line" data-line="45">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="46">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;error_code&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="47">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span><span style="color: #a5d6ff;">&quot;null&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;string&quot;</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="48">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The Amazon S3 error code, or - if no error occurred.&quot;</span>
</div><div class="line" data-line="49">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="50">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;bytes_sent&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="51">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;int&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="52">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The number of response bytes sent, excluding HTTP protocol overhead, or  -  if zero.&quot;</span>
</div><div class="line" data-line="53">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="54">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;object_size&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="55">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;int&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="56">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The total size of the object in question.&quot;</span>
</div><div class="line" data-line="57">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="58">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;total_time&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="59">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;int&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="60">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The number of milliseconds the request was in flight from the server&#39;s perspective. This value is measured from the time your request is received to the time that the last byte of the response is sent.&quot;</span>
</div><div class="line" data-line="61">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="62">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;turn_around_time&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="63">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;int&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="64">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The number of milliseconds that Amazon S3 spent processing your request. This value is measured from the time the last byte of your request was received until the time the first byte of the response was sent.&quot;</span>
</div><div class="line" data-line="65">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="66">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;referrer&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="67">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span><span style="color: #a5d6ff;">&quot;null&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;string&quot;</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="68">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The value of the HTTP Referrer header, if present. HTTP user-agents (e.g. browsers) typically set this header to the URL of the linking or embedding page when making a request.&quot;</span>
</div><div class="line" data-line="69">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="70">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;user_agent&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="71">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span><span style="color: #a5d6ff;">&quot;null&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;string&quot;</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="72">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The value of the HTTP User-Agent header.&quot;</span>
</div><div class="line" data-line="73">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="74">      <span style="color: #79c0ff;">&quot;name&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;version_id&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="75">      <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">[</span><span style="color: #a5d6ff;">&quot;null&quot;</span><span style="color: #e6edf3;">,</span> <span style="color: #a5d6ff;">&quot;string&quot;</span><span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="76">      <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;The version ID in the request, or - if the operation does not take a versionId parameter.&quot;</span>
</div><div class="line" data-line="77">    <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="78">  <span style="color: #e6edf3;">]</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="79">  <span style="color: #79c0ff;">&quot;doc&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;Schema for Amazon log format - http://docs.aws.amazon.com/AmazonS3/latest/dev/LogFormat.html&quot;</span>
</div><div class="line" data-line="80"><span style="color: #e6edf3;">&rbrace;</span>
</div></code></pre>
<p>First and foremost, Avro does not allow “-” in field names, so we have to use “_” instead. Since Amazon allows null values for certain fields, we need to reflect that in our schema. Defining nullable fields is easy: we use an array of possible types as the type, as in the example above. There is no strong support for dates yet, so we are going to store the time as a string for now. Once the schema is written, we can use it to create an Avro file.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(def schema-file                      &quot;schema/amazon-log.avsc&quot;)
</div><div class="line" data-line="2">(def s3-log-avro-schema-json          (json/parse-stream (io/reader schema-file)))
</div><div class="line" data-line="3">(def s3-log-avro-schema-fields        (get-avro-schema-fields s3-log-avro-schema-json))
</div><div class="line" data-line="4">(def s3-log-avro-schema-fields-dash   (replace-field-names s3-log-avro-schema-fields &quot;_&quot; &quot;-&quot;))
</div><div class="line" data-line="5">(def s3-log-avro-schema               (avro-schema schema-file))
</div><div class="line" data-line="6">(def int-fields                        #&lbrace;:turn-around-time :http-status :total-time :bytes-sent :object-size&rbrace;)
</div><div class="line" data-line="7">
</div><div class="line" data-line="8">s3-logrotate.core=&gt; (def avr-file (avro/data-file-writer &quot;deflate&quot; s3-log-avro-schema (str &quot;data/&quot; &quot;tt.avro&quot;)))
</div><div class="line" data-line="9">#&#39;s3-logrotate.core/avr-file
</div><div class="line" data-line="10">s3-logrotate.core=&gt; (for [_ (range 10)] (.append avr-file &#39;&lbrace;
</div><div class="line" data-line="11">:request-uri &quot;PUT /www.streambrightdata.com/logs/2015-12-08-04-42-35-C1C3217A278399FA HTTP/1.1&quot;,
</div><div class="line" data-line="12">:request-id &quot;BDE6E681EDC7FDB0&quot;, :user-agent &quot;aws-internal/3&quot;, :remote-ip &quot;10.194.229.49&quot;,
</div><div class="line" data-line="13">:key &quot;logs/2015-12-08-04-42-35-C1C3217A278399FA&quot;, :version-id nil, :time &quot;08/Dec/2015:04:42:35 +0000&quot;,
</div><div class="line" data-line="14">:operation &quot;REST.PUT.OBJECT&quot;, :object-size 398, :error-code nil, :bytes-sent 0, :referrer nil,
</div><div class="line" data-line="15">:requester &quot;3272ee65a9&quot;, :http-status 200, :turn-around-time 20, :total-time 36,
</div><div class="line" data-line="16">:bucket &quot;www.streambrightdata.com&quot;, :bucket-owner &quot;f2b98d9dd4d&quot;
</div><div class="line" data-line="17">&rbrace;))
</div><div class="line" data-line="18">s3-logrotate.core=&gt; (.close avr-file)
</div><div class="line" data-line="19">nil
</div><div class="line" data-line="20">s3-logrotate.core=&gt; (count (with-open [adf (avro/data-file-reader &quot;data/tt.avro&quot;)] (doall (seq adf))))
</div><div class="line" data-line="21">10
</div></code></pre>
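<p>The naming and null rules described above can also be sketched outside Clojure. The following is a minimal, JDK-only Java illustration, not code from s3-logrotate: the class name AvroFieldSketch and its helper methods are hypothetical, and the set of nullable fields is copied by hand from the schema shown earlier.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only (not part of s3-logrotate).
final class AvroFieldSketch {
    // Fields declared as ["null", "string"] in the schema above.
    static final Set<String> NULLABLE = Set.of(
        "requester", "key", "error_code", "referrer", "user_agent", "version_id");

    // Avro names may only contain letters, digits and underscores,
    // so dashed names are rewritten with underscores.
    static String toAvroName(String name) {
        return name.replace('-', '_');
    }

    // S3 writes "-" for an absent value; map it to null, but only for
    // fields whose schema type includes "null".
    static Object toAvroValue(String avroName, String raw) {
        if (NULLABLE.contains(avroName) && "-".equals(raw)) {
            return null;
        }
        return raw;
    }

    // Converts one parsed log line (dashed names, raw strings) into a
    // map that follows the schema's naming and null conventions.
    static Map<String, Object> toRecord(Map<String, String> parsed) {
        Map<String, Object> record = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : parsed.entrySet()) {
            String name = toAvroName(e.getKey());
            record.put(name, toAvroValue(name, e.getValue()));
        }
        return record;
    }

    public static void main(String[] args) {
        Map<String, String> parsed = new LinkedHashMap<>();
        parsed.put("error-code", "-");
        parsed.put("request-id", "BDE6E681EDC7FDB0");
        System.out.println(toRecord(parsed));
    }
}
```

<p>Keeping the nullable set next to the conversion makes a mismatch with the schema easy to spot: a "-" in a non-nullable field is passed through unchanged instead of silently turning into null.</p>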
<h2 id="summarising"><a href="#summarising">Summarising</a></h2>
<p>As you can see, Clojure provides pretty good tools to work with Amazon S3 and Avro files. The codebase is pretty small (559 LOC) and it already does a lot. In the next articles in the series I am going to make it asynchronous and faster with core.async (channels) and finish the code that uploads the converted files to S3 afterwards. Even though I have already processed two months' worth of logs with s3-logrotate and it works reasonably well, it is just a prototype at this stage. I am going to improve it during the upcoming months.</p>
<p>The full code is available here:</p>
<p><a href="https://github.com/StreamBright/s3-logrotate">https://github.com/StreamBright/s3-logrotate</a></p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 03 Feb 2016 21:32:28 +0100</pubDate>
    </item>
    <item>
      <title>High Performance Kafka Producer on AWS</title>
      <link>https://dev.l1x.be/posts/2015/03/02/high-performance-kafka-producer-on-aws/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2015/03/02/high-performance-kafka-producer-on-aws/</guid>
      <description>&lt;p&gt;The modern analytics stack uses some sort of low-latency data bus to power both real-time and batch pipelines. In this post we are going to take a closer look at Apache Kafka, which can handle billions of events every hour on a few nodes using commodity hardware. It tolerates outages on both the producer and the consumer side, or even in the service itself (a Kafka broker going down). All of this makes it a great fit for analytics, especially for real-time streaming systems. The rewritten library &lt;a href=&quot;http://mvnrepository.com/artifact/org.apache.kafka/kafka-clients/0.8.2.1&quot;&gt;kafka-clients&lt;/a&gt; just got released, containing both the producer and the consumer client, but the latter is not implemented yet. Let&apos;s have a look at the new configuration and the code.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="context"><a href="#context">Context</a></h2>
<p>The modern analytics stack uses some sort of low-latency data bus to power both real-time and batch pipelines. In this post we are going to take a closer look at Apache Kafka, which can handle billions of events every hour on a few nodes using commodity hardware. It tolerates outages on both the producer and the consumer side, or even in the service itself (a Kafka broker going down). All of this makes it a great fit for analytics, especially for real-time streaming systems. The rewritten library <a href="http://mvnrepository.com/artifact/org.apache.kafka/kafka-clients/0.8.2.1">kafka-clients</a> just got released, containing both the producer and the consumer client, but the latter is not implemented yet. Let's have a look at the new configuration and the code.</p>
<h3 id="kafka-producer"><a href="#kafka-producer">Kafka Producer</a></h3>
<p>The producer, as the name suggests, sends data to the brokers. It can be operated in sync and async modes. The sync mode is much slower but guarantees durability, while the async mode is extremely fast but might lose a small percentage of data in case of a node outage. The performance test will be done for both. The producer send() method has a few versions; the simplest case is to send a message without a key. The message is a ProducerRecord&lt;K, V&gt; object that carries the partitioning information.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">public final class ProducerRecord&lt;K, V&gt; &lbrace;
</div><div class="line" data-line="2">    private final String topic;
</div><div class="line" data-line="3">    private final Integer partition;
</div><div class="line" data-line="4">    private final K key;
</div><div class="line" data-line="5">    private final V value;
</div><div class="line" data-line="6">    // ...
</div><div class="line" data-line="7">&rbrace;
</div></code></pre>
<p>If a valid partition number is specified that partition will be used when sending the record.
If no partition is specified but a key is present a partition will be chosen using a hash of the key.
If neither key nor partition is present a partition will be assigned in a round-robin fashion.</p>
<p><a href="https://apache.googlesource.com/kafka/+/0.8.2.1/clients/src/main/java/org/apache/kafka/clients/producer/ProducerRecord.java">source</a></p>
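<p>The three rules quoted above can be sketched in a few lines of plain Java. This is an illustration of the decision order only, not Kafka's actual partitioner: the class PartitionSketch is hypothetical, and Arrays.hashCode stands in for whatever hash function the client really uses, so the partition numbers it produces will not match a real cluster.</p>

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the partition selection order (not Kafka code).
final class PartitionSketch {
    private final AtomicInteger counter = new AtomicInteger(0);

    int choose(Integer partition, byte[] key, int numPartitions) {
        // Rule 1: an explicit, valid partition always wins.
        if (partition != null) {
            if (partition < 0 || partition >= numPartitions) {
                throw new IllegalArgumentException("invalid partition: " + partition);
            }
            return partition;
        }
        // Rule 2: no partition but a key: hash the key.
        // Assumption: Arrays.hashCode is a stand-in hash; the mask clears
        // the sign bit so the modulo result is never negative.
        if (key != null) {
            return (Arrays.hashCode(key) & 0x7fffffff) % numPartitions;
        }
        // Rule 3: neither key nor partition: round-robin.
        return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        PartitionSketch sketch = new PartitionSketch();
        System.out.println(sketch.choose(2, null, 4));
        System.out.println(sketch.choose(null, "user-1".getBytes(), 4));
        System.out.println(sketch.choose(null, null, 4));
    }
}
```

<p>The practical consequence is that records with the same key always land in the same partition, while keyless records are spread evenly across all of them.</p>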
<p>Asynchronous mode is implemented in the producer using <a href="http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Future.html">Java Futures</a>. That makes it easy for both use cases to get the behavior they want. Consider the following:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">//init
</div><div class="line" data-line="2">KafkaProducer&lt;byte[], byte[]&gt; producer = new KafkaProducer&lt;byte[], byte[]&gt;(props);
</div><div class="line" data-line="3">ProducerRecord&lt;byte[], byte[]&gt; record = new ProducerRecord&lt;byte[], byte[]&gt;(topicName, payload);
</div><div class="line" data-line="4">
</div><div class="line" data-line="5">//async
</div><div class="line" data-line="6">producer.send(record);
</div><div class="line" data-line="7">//sync
</div><div class="line" data-line="8">producer.send(record).get(250, TimeUnit.MILLISECONDS);
</div></code></pre>
<p>What happens in case of a failed send can easily be decided in the application layer (retry, propagate the error, etc.). I left out the callback that can also be passed to send(). It comes in really handy for monitoring how long a request takes, or if you would like to retry the send or save the message to local disk for further processing. We can extend the previous code with a simple callback:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">long sendStart = System.currentTimeMillis();
</div><div class="line" data-line="2">Callback callback = mycallback.nextCompletion(sendStart, payload.length);
</div><div class="line" data-line="3">producer.send(record, callback);
</div></code></pre>
<p>This maintains the asynchronous nature of the method but gives us an opportunity to at least gather statistics (latency, size, status) about the outcome.</p>
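<p>As an illustration of handling a failed send in the application layer, here is a minimal retry sketch built only on java.util.concurrent. It assumes nothing about Kafka itself: the Supplier stands in for a call such as producer.send(record), and the class SendRetrySketch, the attempt limit and the simulated failures are all hypothetical.</p>

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical sketch: synchronous send with a timeout and a bounded retry.
final class SendRetrySketch {
    // Calls send up to maxAttempts times, waiting timeoutMs for each result.
    // Returns the number of attempts used, or -1 if every attempt failed.
    static int sendWithRetry(Supplier<Future<Long>> send, int maxAttempts, long timeoutMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                send.get().get(timeoutMs, TimeUnit.MILLISECONDS);
                return attempt;
            } catch (ExecutionException | TimeoutException e) {
                // The send failed or took too long; loop and try again.
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up.
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated send: fails twice, then succeeds.
        Supplier<Future<Long>> flaky = () -> {
            calls[0]++;
            if (calls[0] < 3) {
                CompletableFuture<Long> failed = new CompletableFuture<>();
                failed.completeExceptionally(new RuntimeException("broker unavailable"));
                return failed;
            }
            return CompletableFuture.completedFuture(42L);
        };
        System.out.println("attempts: " + sendWithRetry(flaky, 5, 250));
    }
}
```

<p>A real producer would probably also want a backoff between attempts and a fallback (for example writing to local disk) once the attempts are exhausted.</p>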
<h4 id="new-kafka-producer-configuration"><a href="#new-kafka-producer-configuration">New Kafka Producer Configuration</a></h4>
<p>The configuration options available with the new producer are the following:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">&lbrace;
</div><div class="line" data-line="2">    :bootstrap.servers                      &quot;localhost:9092&quot;
</div><div class="line" data-line="3">    :metadata.fetch.timeout.ms              &quot;1000&quot;
</div><div class="line" data-line="4">    :metadata.max.age.ms                    &quot;100000&quot;
</div><div class="line" data-line="5">    :batch.size                             &quot;256&quot;
</div><div class="line" data-line="6">    :buffer.memory                          &quot;1024000&quot; ; 1000 * 1024
</div><div class="line" data-line="7">    :acks                                   &quot;-1&quot;
</div><div class="line" data-line="8">    :timeout.ms                             &quot;250&quot;
</div><div class="line" data-line="9">    :linger.ms                              &quot;0&quot;
</div><div class="line" data-line="10">    :client.id                              &quot;test-client&quot;
</div><div class="line" data-line="11">    :send.buffer.bytes                      &quot;102400&quot; ; 100 * 1024
</div><div class="line" data-line="12">    :receive.buffer.bytes                   &quot;102400&quot; ; 100 * 1024
</div><div class="line" data-line="13">    :max.request.size                       &quot;5000000&quot;
</div><div class="line" data-line="14">    :reconnect.backoff.ms                   &quot;100&quot;
</div><div class="line" data-line="15">    :block.on.buffer.full                   &quot;true&quot;
</div><div class="line" data-line="16">    :retries                                &quot;3&quot;
</div><div class="line" data-line="17">    :retry.backoff.ms                       &quot;100&quot;
</div><div class="line" data-line="18">    :compression.type                       &quot;snappy&quot;
</div><div class="line" data-line="19">    :metrics.sample.window.ms               &quot;&quot;
</div><div class="line" data-line="20">    :metrics.num.samples                    &quot;&quot;
</div><div class="line" data-line="21">    :metric.reporters                       &quot;&quot;
</div><div class="line" data-line="22">    :max.in.flight.requests.per.connection  &quot;1024&quot;
</div><div class="line" data-line="23">    :key.serializer                         &quot;org.apache.kafka.common.serialization.ByteArraySerializer&quot;
</div><div class="line" data-line="24">    :value.serializer                       &quot;org.apache.kafka.common.serialization.ByteArraySerializer&quot;
</div><div class="line" data-line="25"> &rbrace;
</div></code></pre>
<p>The configuration has to be tailored to each use case; each producer has an entirely different set of parameters depending on its role in the pipeline. The new org.apache.kafka.common namespace contains the serialization libraries; String and ByteArray are the currently existing types. The entire config is much cleaner and more logical than in the previous version. The full description of each parameter is available <a href="https://apache.googlesource.com/kafka/+/0.8.2.1/clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java">here</a>.</p>
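<p>A configuration map like the one above eventually has to become a java.util.Properties object, which is what the KafkaProducer constructor in the earlier snippet receives as props. A minimal sketch of that conversion follows; the class ProducerConfigSketch is hypothetical, and skipping empty values is an assumption made here so that unset options fall back to the client's defaults.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper that turns a plain config map into Properties.
final class ProducerConfigSketch {
    static Properties toProperties(Map<String, String> config) {
        Properties props = new Properties();
        for (Map.Entry<String, String> e : config.entrySet()) {
            // Assumption: an empty string means "not set", so the key is
            // left out instead of being passed on as an empty value.
            if (!e.getValue().isEmpty()) {
                props.setProperty(e.getKey(), e.getValue());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("bootstrap.servers", "localhost:9092");
        config.put("acks", "-1");
        config.put("compression.type", "snappy");
        config.put("metric.reporters", "");
        Properties props = toProperties(config);
        System.out.println(props.stringPropertyNames());
    }
}
```

<p>Keeping the configuration as a plain map until the last moment makes it easy to maintain one base config and override a handful of keys per producer role.</p>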
<h2 id="benchmarking-kafka"><a href="#benchmarking-kafka">Benchmarking Kafka</a></h2>
<p>For the benchmark we are going to use a slightly modified version of <a href="https://apache.googlesource.com/kafka/+/0.8.2.1/clients/src/main/java/org/apache/kafka/clients/tools/ProducerPerformance.java">ProducerPerformance</a>. I would like to implement the synchronous send as above and measure the performance that way too.</p>
<h2 id="hardware"><a href="#hardware">Hardware</a></h2>
<p>I used 5 servers for Kafka (3 Zookeepers and 5 brokers) on m3.large instances (vCPU: 2, mem: 7.5G), each with a RAID10 array of 4 x 512G EBS volumes. I think it is not worth the hassle to use RAID arrays with EBS, but I wanted to try it out in case any client asks for it. The IO characteristics of the array are better than those of a single volume, but a rebuild introduces downtime for the node, or a period of limited performance, that might be worse than a complete node outage. Now, the question is which is better: letting Kafka re-replicate the data that resided on a dead node to other nodes, or adding a layer that does a similar kind of recovery at the single-node level. I was able to reach 30-40% IOWait with 3 nodes running the ProducerPerformance code.</p>
<p>A little more investigation is needed to find which combination of nodes and EBS type would be the best, but this depends on the volume, the latency requirements and the typical message size. Often there is more than one Kafka cluster when the requirements differ that much. Dedicating a cluster to a single set of requirements works best.</p>
<h2 id="results"><a href="#results">Results</a></h2>
<p><img src="/static/img/blog/kafka-bench-001.png" alt="Kafka Throughput" title="Kafka Throughput [Single Node]" /></p>
<p><img src="/static/img/blog/kafka-bench-002.png" alt="Kafka Throughput" title="Kafka Throughput [Cluster]" /></p>
<p><img src="/static/img/blog/kafka-bench-003.png" alt="Kafka Throughput" title="Kafka Throughput [Cluster]" /></p>
<p><img src="/static/img/blog/kafka-bench-004.png" alt="Kafka Latency" title="Kafka Latency p50 [Cluster]" /></p>
<p><img src="/static/img/blog/kafka-bench-005.png" alt="Kafka Latency" title="Kafka Latency p50 [Cluster]" /></p>
<p>The test details can be found here, along with other test results. Parameters for ProducerPerformance are included as a header for each run.</p>
<p><a href="https://github.com/l1x/kafka-bench/blob/master/ec2-52-11-124-236.us-west-2.compute.amazonaws.com/log.5">Example log</a></p>
<p>These results are just a quick peek into Kafka's performance profile, which can be tuned for a specific use case.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Mon, 02 Mar 2015 19:00:28 +0100</pubDate>
    </item>
    <item>
      <title>Simple Kafka Consumer In Clojure</title>
      <link>https://dev.l1x.be/posts/2014/05/28/simple-kafka-consumer-in-clojure/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2014/05/28/simple-kafka-consumer-in-clojure/</guid>
      <content:encoded><![CDATA[<h2 id="kafka"><a href="#kafka">Kafka</a></h2>
<p>Kafka is a high performance publish-subscribe messaging system, implemented as a distributed commit log. The message streams are organized into topics, and topics are broken down further into partitions.</p>
<p><img src="/images/log_anatomy.webp" alt="Kafka Anatomy" title="Kafka Anatomy" /></p>
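<p>To make the log model a bit more concrete, here is a minimal sketch in plain Clojure. It is only an illustration and not how Kafka stores data: the function append-message and the vector of vectors layout are made up for this example.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn append-message
</div><div class="line" data-line="2">  &quot;returns the topic with the message appended to the given partition&quot;
</div><div class="line" data-line="3">  [topic partition message]
</div><div class="line" data-line="4">  (update-in topic [partition] conj message))
</div><div class="line" data-line="5">
</div><div class="line" data-line="6">;; a hypothetical topic with two partitions, each one is an append only log
</div><div class="line" data-line="7">(def topic-log
</div><div class="line" data-line="8">  (-&gt; [[] []]
</div><div class="line" data-line="9">      (append-message 0 &quot;a&quot;)
</div><div class="line" data-line="10">      (append-message 1 &quot;b&quot;)
</div><div class="line" data-line="11">      (append-message 0 &quot;c&quot;)))
</div><div class="line" data-line="12">
</div><div class="line" data-line="13">;; topic-log is now [[&quot;a&quot; &quot;c&quot;] [&quot;b&quot;]]
</div><div class="line" data-line="14">;; the offset of a message is its position within its partition
</div><div class="line" data-line="15">(get-in topic-log [0 1]) ;; &quot;c&quot;
</div></code></pre>
<p>Ordering is only defined within a partition: &quot;a&quot; comes before &quot;c&quot; in partition 0, but nothing is said about &quot;b&quot; relative to them.</p>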
<p>This system is suitable for realtime applications, using Zookeeper as the strong consistency provider. The high level APIs are pretty easy to use but often misunderstood. I have spent some time trying to figure out why the libraries out there did not work the way I expected, but after a while I realized it is better to write a very simple library that wraps the functionality I need.</p>
<h2 id="shovel"><a href="#shovel">Shovel</a></h2>
<p><a href="https://github.com/l1x/shovel">Shovel</a> is a minimal wrapper around the Kafka client APIs. The code is mostly documented and type hinted for better performance. By default, Kafka tolerates any service outage on the consumer side, meaning the consumer resumes operation when the broker comes back, whereas the producer simply throws a connection refused exception. This behavior enables us to consume the messages with a simple blocking stream that blocks the execution when there are no new messages or the broker is down. Let's have a closer look at how the consumer works.</p>
<h3 id="consumer"><a href="#consumer">Consumer</a></h3>
<p>The Kafka consumer consists of a few things. First we need to get a ConsumerConnector to connect to the broker. I am using a hashmap for the configuration and convert it to java.util.Properties to create a ConsumerConfig, which is then used to create the ConsumerConnector that is returned.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn consumer-connector
</div><div class="line" data-line="2">  &quot;returns a ConsumerConnector that can be used to create consumer streams&quot;
</div><div class="line" data-line="3">  ^ConsumerConnector [^clojure.lang.PersistentArrayMap h]
</div><div class="line" data-line="4">  (let [config (ConsumerConfig. (hashmap-to-properties h))]
</div><div class="line" data-line="5">    (Consumer/createJavaConsumerConnector config)))
</div></code></pre>
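<p>The hashmap-to-properties helper is not shown in this post. A minimal sketch of what it could look like, assuming string (or keyword) keys and values that can be turned into strings; the actual implementation in Shovel may differ. The configuration keys are those of the old high level consumer, and the values are placeholders.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn hashmap-to-properties
</div><div class="line" data-line="2">  &quot;copies the entries of a Clojure map into a java.util.Properties (sketch)&quot;
</div><div class="line" data-line="3">  ^java.util.Properties [h]
</div><div class="line" data-line="4">  (let [props (java.util.Properties.)]
</div><div class="line" data-line="5">    (doseq [[k v] h]
</div><div class="line" data-line="6">      (.setProperty props (name k) (str v)))
</div><div class="line" data-line="7">    props))
</div><div class="line" data-line="8">
</div><div class="line" data-line="9">;; hypothetical consumer configuration, the values are placeholders
</div><div class="line" data-line="10">(def consumer-config
</div><div class="line" data-line="11">  &lbrace;&quot;zookeeper.connect&quot; &quot;localhost:2181&quot;
</div><div class="line" data-line="12">   &quot;group.id&quot;          &quot;shovel-test&quot;
</div><div class="line" data-line="13">   &quot;auto.offset.reset&quot; &quot;smallest&quot;&rbrace;)
</div><div class="line" data-line="14">
</div><div class="line" data-line="15">(.getProperty (hashmap-to-properties consumer-config) &quot;group.id&quot;) ;; &quot;shovel-test&quot;
</div></code></pre>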
<p>The ConsumerConnector can be used to get the message streams (java.util.ArrayList).</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn message-streams
</div><div class="line" data-line="2">  &quot;returning the message-streams with a certain topic and thread-pool-size
</div><div class="line" data-line="3">  message-streams can be processed in threads with simple blocking on empty queue&quot;
</div><div class="line" data-line="4">  ^java.util.ArrayList [^ConsumerConnector consumer ^String topic ^Integer thread-pool-size]
</div><div class="line" data-line="5">  (.get
</div><div class="line" data-line="6">    (.createMessageStreams consumer &lbrace;topic thread-pool-size&rbrace;) topic))
</div></code></pre>
<p>Message streams are then consumed by a simple iterator. There is the Kafka way of doing that but also the more idiomatic way in Clojure; the latter is how I implemented it.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn default-iterator
</div><div class="line" data-line="2">  &quot;processing all streams in a thread and printing the message field for each message&quot;
</div><div class="line" data-line="3">  [^java.util.ArrayList streams]
</div><div class="line" data-line="4">  (let [c (async/chan)]
</div><div class="line" data-line="5">    ;; create a thread for each stream
</div><div class="line" data-line="6">    (doseq
</div><div class="line" data-line="7">      [^kafka.consumer.KafkaStream stream streams]
</div><div class="line" data-line="8">      (let [uuid (uuid)]
</div><div class="line" data-line="9">        (async/thread
</div><div class="line" data-line="10">          (async/&gt;!! c
</div><div class="line" data-line="11">            (doseq
</div><div class="line" data-line="12">              [^kafka.message.MessageAndMetadata message stream]
</div><div class="line" data-line="13">              (println (str &quot;uuid: &quot; uuid &quot; :: &quot;(String. (nth (message-to-vec message) 4)))))))))
</div><div class="line" data-line="14">    ;; read the channel forever
</div><div class="line" data-line="15">    (while true
</div><div class="line" data-line="16">      (async/&lt;!! c))))
</div></code></pre>
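<p>The blocking behavior is easier to see without Kafka in the picture. In the following sketch a java.util.concurrent.LinkedBlockingQueue stands in for a KafkaStream (this substitution is my assumption for the sake of the example, and take-messages is a made up name): the consumer thread starts first, blocks on the empty queue and continues as soon as messages show up.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(import &#39;java.util.concurrent.LinkedBlockingQueue)
</div><div class="line" data-line="2">
</div><div class="line" data-line="3">(defn take-messages
</div><div class="line" data-line="4">  &quot;takes n messages from the queue, blocking while the queue is empty&quot;
</div><div class="line" data-line="5">  [^LinkedBlockingQueue queue n]
</div><div class="line" data-line="6">  (vec (repeatedly n #(.take queue))))
</div><div class="line" data-line="7">
</div><div class="line" data-line="8">(def queue (LinkedBlockingQueue.))
</div><div class="line" data-line="9">(def result (promise))
</div><div class="line" data-line="10">
</div><div class="line" data-line="11">;; the consumer thread starts first and blocks on the empty queue
</div><div class="line" data-line="12">(.start (Thread. #(deliver result (take-messages queue 3))))
</div><div class="line" data-line="13">
</div><div class="line" data-line="14">;; the messages arrive later, one by one
</div><div class="line" data-line="15">(doseq [m [&quot;a&quot; &quot;b&quot; &quot;c&quot;]]
</div><div class="line" data-line="16">  (.put queue m))
</div><div class="line" data-line="17">
</div><div class="line" data-line="18">@result ;; [&quot;a&quot; &quot;b&quot; &quot;c&quot;]
</div></code></pre>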
<p>Each message is a kafka.message.MessageAndMetadata that can be processed the following way:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn- message-to-vec
</div><div class="line" data-line="2">  &quot;returns a vector of all of the message fields&quot;
</div><div class="line" data-line="3">  [^kafka.message.MessageAndMetadata message]
</div><div class="line" data-line="4">  [(.topic message) (.offset message) (.partition message) (.key message) (.message message)])
</div></code></pre>
<p>The key and the message are byte arrays that can be easily converted to strings.</p>
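<p>A minimal sketch of that conversion, assuming the producer wrote UTF-8 encoded text (bytes-to-string is a name I made up for this example). The key can be nil for messages sent without a key, so the function guards against that.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn bytes-to-string
</div><div class="line" data-line="2">  &quot;decodes a byte array as UTF-8, returns nil for a nil input&quot;
</div><div class="line" data-line="3">  [^bytes b]
</div><div class="line" data-line="4">  (when b
</div><div class="line" data-line="5">    (String. b &quot;UTF-8&quot;)))
</div><div class="line" data-line="6">
</div><div class="line" data-line="7">(bytes-to-string (.getBytes &quot;hello kafka&quot; &quot;UTF-8&quot;)) ;; &quot;hello kafka&quot;
</div><div class="line" data-line="8">(bytes-to-string nil)                              ;; nil
</div></code></pre>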
<h3 id="producer"><a href="#producer">Producer</a></h3>
<p>A Kafka producer is similar to a consumer: there is a producer connector that can be used to send a message.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn producer-connector
</div><div class="line" data-line="2">  [^clojure.lang.PersistentArrayMap h]
</div><div class="line" data-line="3">  (info &quot;fn: producer-connector&quot; &quot; config: &quot; h)
</div><div class="line" data-line="4">  (let [config (ProducerConfig. (hashmap-to-properties h))]
</div><div class="line" data-line="5">    (Producer. config)))
</div><div class="line" data-line="6">
</div><div class="line" data-line="7">(defn message
</div><div class="line" data-line="8">  [topic key value]
</div><div class="line" data-line="9">  (info &quot;fn: message&quot; &quot; topic: &quot; topic)
</div><div class="line" data-line="10">  (KeyedMessage. topic key value))
</div><div class="line" data-line="11">
</div><div class="line" data-line="12">(defn produce
</div><div class="line" data-line="13">  [^Producer producer ^KeyedMessage message]
</div><div class="line" data-line="14">  (info &quot;fn: produce message: &quot; message)
</div><div class="line" data-line="15">  (.send producer message))
</div></code></pre>
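<p>The configuration map for the producer is not shown above. A sketch of what it could look like for the old producer API used here; the broker address, the topic and the key are placeholders, and the choice of the string encoder is my assumption.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">;; hypothetical settings for a single local broker, adjust them to your cluster
</div><div class="line" data-line="2">(def producer-config
</div><div class="line" data-line="3">  &lbrace;&quot;metadata.broker.list&quot;  &quot;localhost:9092&quot;
</div><div class="line" data-line="4">   &quot;serializer.class&quot;      &quot;kafka.serializer.StringEncoder&quot;
</div><div class="line" data-line="5">   &quot;request.required.acks&quot; &quot;1&quot;&rbrace;)
</div><div class="line" data-line="6">
</div><div class="line" data-line="7">;; with Kafka on the classpath and the functions above loaded:
</div><div class="line" data-line="8">;; (produce (producer-connector producer-config)
</div><div class="line" data-line="9">;;          (message &quot;test-topic&quot; &quot;key-1&quot; &quot;hello&quot;))
</div></code></pre>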
<h2 id="credits"><a href="#credits">Credits</a></h2>
<p>The kudos go to @nikore.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 28 May 2014 20:11:02 +0100</pubDate>
    </item>
    <item>
      <title>Using custom schema with Riak Search 2.0</title>
      <link>https://dev.l1x.be/posts/2014/03/05/using-custom-schema-with-riak-search-2.0/</link>
      <guid isPermaLink="true">https://dev.l1x.be/posts/2014/03/05/using-custom-schema-with-riak-search-2.0/</guid>
      <content:encoded><![CDATA[<h2 id="riak-20"><a href="#riak-20">Riak 2.0</a></h2>
<p>There are a lot of new features coming down the pipe with Riak 2.0 but the most important one (at least to me) is Riak Search 2.0.</p>
<p>What is Riak Search 2.0 exactly? Riak is a very simple key value store with AP properties in the CAP theorem land. It scales very well and thanks to Erlang it is extremely reliable. There are very few features (and this is a great thing for a data store), which is why introducing Solr as the revamped search is kind of a big deal. I have never used Solr before, so watch me fail or maybe succeed indexing some dbpedia documents.</p>
<h3 id="test-data"><a href="#test-data">Test data</a></h3>
<p>Getting the test data</p>
<p>Dbpedia is a data query interface (SPARQL, which is SQL like) to structured data extracted from Wikipedia. I am going to query it for all the cities on the planet with a population bigger than 50,000. The query looks like this:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">SELECT DISTINCT *
</div><div class="line" data-line="2">WHERE &lbrace;
</div><div class="line" data-line="3">  ?city rdf:type dbpedia-owl:City ;
</div><div class="line" data-line="4">  rdfs:label ?label;
</div><div class="line" data-line="5">  dbpedia-owl:abstract ?abstract ;
</div><div class="line" data-line="6">  dbpedia-owl:populationTotal ?pop ;
</div><div class="line" data-line="7">  dbpedia-owl:country ?country .
</div><div class="line" data-line="8">FILTER
</div><div class="line" data-line="9">  (lang(?abstract) = &#39;en&#39; &amp;&amp; lang(?label) = &#39;en&#39; &amp;&amp; ?pop &gt; 50000)
</div><div class="line" data-line="10">&rbrace;
</div><div class="line" data-line="11">ORDER BY ?pop
</div></code></pre>
<p><a href="http://goo.gl/0DyVdn">Dbpedia</a></p>
<p>The data needs to be sliced up into individual JSON files so we can load them into Riak easily and make Solr index the files using the custom schema. After removing some header information, the data from dbpedia looks like the following:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-json" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">[</span>
</div><div class="line" data-line="2">    <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="3">        <span style="color: #79c0ff;">&quot;abstract&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="4">            <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;literal&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="5">            <span style="color: #79c0ff;">&quot;value&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;Shawinigan is a city located...&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="6">            <span style="color: #79c0ff;">&quot;xml:lang&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;en&quot;</span>
</div><div class="line" data-line="7">        <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="8">        <span style="color: #79c0ff;">&quot;city&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="9">            <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;uri&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="10">            <span style="color: #79c0ff;">&quot;value&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;http://dbpedia.org/resource/Shawinigan&quot;</span>
</div><div class="line" data-line="11">        <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="12">        <span style="color: #79c0ff;">&quot;country&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="13">            <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;uri&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="14">            <span style="color: #79c0ff;">&quot;value&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;http://dbpedia.org/resource/Canada&quot;</span>
</div><div class="line" data-line="15">        <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="16">        <span style="color: #79c0ff;">&quot;label&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="17">            <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;literal&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="18">            <span style="color: #79c0ff;">&quot;value&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;Shawinigan&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="19">            <span style="color: #79c0ff;">&quot;xml:lang&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;en&quot;</span>
</div><div class="line" data-line="20">        <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="21">        <span style="color: #79c0ff;">&quot;pop&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="22">            <span style="color: #79c0ff;">&quot;datatype&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;http://www.w3.org/2001/XMLSchema#integer&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="23">            <span style="color: #79c0ff;">&quot;type&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;typed-literal&quot;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="24">            <span style="color: #79c0ff;">&quot;value&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;50060&quot;</span>
</div><div class="line" data-line="25">        <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="26">    <span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">,</span>
</div><div class="line" data-line="27">    <span style="color: #e6edf3;">&lbrace;</span>
</div><div class="line" data-line="28">        <span style="color: #79c0ff;">&quot;another&quot;</span><span style="color: #e6edf3;">:</span> <span style="color: #a5d6ff;">&quot;city&quot;</span>
</div><div class="line" data-line="29">    <span style="color: #e6edf3;">&rbrace;</span>
</div><div class="line" data-line="30"><span style="color: #e6edf3;">]</span>
</div></code></pre>
<p>You get the idea: it is a nested data structure, an array of smaller hashes. I have processed it with Clojure to get one entry per file, using uuids to have unique file names (keys).</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(require &#39;[clojure.data.json :as json])
</div><div class="line" data-line="2">(defn uuid
</div><div class="line" data-line="3">  &quot;Returns a new java.util.UUID as string&quot;
</div><div class="line" data-line="4">  []
</div><div class="line" data-line="5">  (str (java.util.UUID/randomUUID)))
</div><div class="line" data-line="6">(def cities (slurp &quot;cities.pp.json&quot;))
</div><div class="line" data-line="7">(def cities-json (json/read-str cities))
</div><div class="line" data-line="8">;; doseq forces the writes, a lazy map only runs them when the result is consumed
</div><div class="line" data-line="9">(doseq [city cities-json]
</div><div class="line" data-line="10">  (spit (str &quot;t/&quot; (uuid) &quot;.json&quot;) (json/write-str city)))
</div></code></pre>
<p>This produces a bunch of JSON files that I can upload to Riak. Before we get there, let's start up and configure our Riak service.</p>
<h3 id="configuring-the-riak-cluster"><a href="#configuring-the-riak-cluster">Configuring the Riak cluster</a></h3>
<p>I am using the <a href="https://github.com/basho/yokozuna/tree/v0.14.0">Yokozuna</a> release, version 0.14.0. After downloading the source we need to create a devrel with two nodes. I assume you have Erlang and the build tools installed.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">make</span> <span style="color: #e6edf3;">stagedevrel</span> <span style="color: #e6edf3;">DEVNODES=2</span>
</div></code></pre>
<p>There might be some libs missing, but I don't want to go too much into the operating system specific part of the story. After the dev nodes are created we need to configure Riak and enable search. I prefer to use LevelDB as the persistent store and I would like to make Riak listen on all of the available interfaces, making our lives easier in a virtualized environment. Let's do all of these.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">d</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">dev/dev?</span><span style="color: #e6edf3;">;</span> <span style="color: #d2a8ff;">do</span>
</div><div class="line" data-line="2">    <span style="color: #d2a8ff;">sed</span> <span style="color: #e6edf3;">-e</span> <span style="color: #a5d6ff;">&#39;s/storage_backend = bitcask/storage_backend = leveldb/&#39;</span> \
</div><div class="line" data-line="3">    <span style="color: #e6edf3;">-i.back</span> <span style="color: #e6edf3;">$</span><span style="color: #e6edf3;">d</span><span style="color: #e6edf3;">/etc/riak.conf</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="4">  <span style="color: #d2a8ff;">done</span>
</div><div class="line" data-line="5"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">d</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">dev/dev?</span><span style="color: #e6edf3;">;</span> <span style="color: #d2a8ff;">do</span>
</div><div class="line" data-line="6">    <span style="color: #d2a8ff;">sed</span> <span style="color: #e6edf3;">-e</span> <span style="color: #a5d6ff;">&#39;s/search = off/search = on/&#39;</span> <span style="color: #e6edf3;">-i.back</span> <span style="color: #e6edf3;">$</span><span style="color: #e6edf3;">d</span><span style="color: #e6edf3;">/etc/riak.conf</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="7">  <span style="color: #d2a8ff;">done</span>
</div><div class="line" data-line="8"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">d</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">dev/dev?</span><span style="color: #e6edf3;">;</span> <span style="color: #d2a8ff;">do</span>
</div><div class="line" data-line="9">    <span style="color: #d2a8ff;">sed</span> <span style="color: #e6edf3;">-e</span> <span style="color: #a5d6ff;">&#39;s/127.0.0.1/0.0.0.0/&#39;</span> <span style="color: #e6edf3;">-i.back</span> <span style="color: #e6edf3;">$</span><span style="color: #e6edf3;">d</span><span style="color: #e6edf3;">/etc/riak.conf</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="10">  <span style="color: #d2a8ff;">done</span>
</div></code></pre>
<p>After the configuration part is done, start up the nodes and join them into one cluster, and we are almost ready to start shoving data in.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">cd</span> <span style="color: #e6edf3;">dev</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">i</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">&lbrace;</span><span style="color: #79c0ff;">1</span><span style="color: #79c0ff;">..</span><span style="color: #79c0ff;">2</span><span style="color: #e6edf3;">&rbrace;</span><span style="color: #e6edf3;">;</span> <span style="color: #d2a8ff;">do</span>
</div><div class="line" data-line="3">    <span style="color: #a5d6ff;">dev</span><span style="color: #e6edf3;">$</span><span style="color: #e6edf3;">i</span><span style="color: #a5d6ff;">/bin/riak</span> <span style="color: #e6edf3;">start</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="4">  <span style="color: #d2a8ff;">done</span>
</div><div class="line" data-line="5"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">dev2</span><span style="color: #a5d6ff;">/bin/riak-admin</span> <span style="color: #e6edf3;">cluster</span> <span style="color: #e6edf3;">join</span> <span style="color: #e6edf3;">dev1@0.0.0.0</span>
</div><div class="line" data-line="6"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">dev2</span><span style="color: #a5d6ff;">/bin/riak-admin</span> <span style="color: #e6edf3;">cluster</span> <span style="color: #e6edf3;">plan</span>
</div><div class="line" data-line="7"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">dev2</span><span style="color: #a5d6ff;">/bin/riak-admin</span> <span style="color: #e6edf3;">cluster</span> <span style="color: #e6edf3;">commit</span>
</div></code></pre>
<p>Checking the member status to verify that the ring is evenly distributed among the nodes:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">dev2</span><span style="color: #a5d6ff;">/bin/riak-admin</span> <span style="color: #e6edf3;">member-status</span>
</div><div class="line" data-line="2">
</div><div class="line" data-line="3"><span style="color: #d2a8ff;">==============================</span> <span style="color: #e6edf3;">Membership</span> <span style="color: #e6edf3;">===============================</span>
</div><div class="line" data-line="4"><span style="color: #d2a8ff;">Status</span>     <span style="color: #e6edf3;">Ring</span>    <span style="color: #e6edf3;">Pending</span>    <span style="color: #e6edf3;">Node</span>
</div><div class="line" data-line="5"><span style="color: #d2a8ff;">-------------------------------------------------------------------------</span>
</div><div class="line" data-line="6"><span style="color: #d2a8ff;">valid</span>      <span style="color: #e6edf3;">50.0%</span>      <span style="color: #e6edf3;">--</span>      <span style="color: #a5d6ff;">&#39;dev1@0.0.0.0&#39;</span>
</div><div class="line" data-line="7"><span style="color: #d2a8ff;">valid</span>      <span style="color: #e6edf3;">50.0%</span>      <span style="color: #e6edf3;">--</span>      <span style="color: #a5d6ff;">&#39;dev2@0.0.0.0&#39;</span>
</div><div class="line" data-line="8"><span style="color: #d2a8ff;">-------------------------------------------------------------------------</span>
</div><div class="line" data-line="9"><span style="color: #d2a8ff;">Valid:2</span> <span style="color: #e6edf3;">/</span> <span style="color: #e6edf3;">Leaving:0</span> <span style="color: #e6edf3;">/</span> <span style="color: #e6edf3;">Exiting:0</span> <span style="color: #e6edf3;">/</span> <span style="color: #e6edf3;">Joining:0</span> <span style="color: #e6edf3;">/</span> <span style="color: #e6edf3;">Down:0</span>
</div></code></pre>
<h3 id="preparing-the-bucket-and-loading-the-data"><a href="#preparing-the-bucket-and-loading-the-data">Preparing the bucket and loading the data</a></h3>
<p>In this section we are going to create a Solr schema and an index so that we can index the documents. Think of the schema as the measure of how much Solr understands about the data. It can be configured to reference individual elements in complex nested data structures. The data type can also be configured, which makes range queries possible for numeric data. I don't want to go too much into the details of Solr; it is worth spending a few hours on the <a href="http://heliosearch.org/solr/getting-started/">documentation</a>, as I am just scratching the surface in this post.</p>
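<p>Referencing elements of a nested structure works through flattened field names. As far as I understand, the JSON extractor joins the key path of nested objects with a dot, which is where field names like pop.value in the schema come from. The following sketch mimics that idea in plain Clojure; flatten-keys is a made up helper and not part of Riak or Solr, and it assumes string keys.</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">(defn flatten-keys
</div><div class="line" data-line="2">  &quot;flattens a nested map into one level, joining the key path with dots&quot;
</div><div class="line" data-line="3">  ([m] (flatten-keys nil m))
</div><div class="line" data-line="4">  ([prefix m]
</div><div class="line" data-line="5">   (reduce-kv
</div><div class="line" data-line="6">     (fn [acc k v]
</div><div class="line" data-line="7">       (let [path (if prefix (str prefix &quot;.&quot; k) (str k))]
</div><div class="line" data-line="8">         (if (map? v)
</div><div class="line" data-line="9">           (merge acc (flatten-keys path v))
</div><div class="line" data-line="10">           (assoc acc path v))))
</div><div class="line" data-line="11">     &lbrace;&rbrace;
</div><div class="line" data-line="12">     m)))
</div><div class="line" data-line="13">
</div><div class="line" data-line="14">(flatten-keys &lbrace;&quot;label&quot; &lbrace;&quot;type&quot; &quot;literal&quot; &quot;value&quot; &quot;Shawinigan&quot;&rbrace;
</div><div class="line" data-line="15">               &quot;pop&quot;   &lbrace;&quot;value&quot; &quot;50060&quot;&rbrace;&rbrace;)
</div><div class="line" data-line="16">;; a map with the keys &quot;label.type&quot;, &quot;label.value&quot; and &quot;pop.value&quot;
</div></code></pre>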
<h4 id="creating-an-index-with-a-custom-schema"><a href="#creating-an-index-with-a-custom-schema">Creating an index with a custom schema</a></h4>
<p>First things first, we need to create a schema that is used by the index. I am not really a Solr expert but here is what I came up with:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-plaintext" translate="no" tabindex="0"><div class="line" data-line="1">&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; ?&gt;
</div><div class="line" data-line="2">&lt;schema name=&quot;default&quot; version=&quot;1.5&quot;&gt;
</div><div class="line" data-line="3">  &lt;fields&gt;
</div><div class="line" data-line="4">    &lt;field name=&quot;abstract.value&quot; type=&quot;text_general&quot; indexed=&quot;true&quot; stored=&quot;true&quot; multiValued=&quot;true&quot; /&gt;
</div><div class="line" data-line="5">    &lt;field name=&quot;city.value&quot;     type=&quot;string&quot;       indexed=&quot;true&quot; stored=&quot;true&quot; /&gt;
</div><div class="line" data-line="6">    &lt;field name=&quot;country.value&quot;  type=&quot;string&quot;       indexed=&quot;true&quot; stored=&quot;true&quot; /&gt;
</div><div class="line" data-line="7">    &lt;field name=&quot;label.value&quot;    type=&quot;string&quot;       indexed=&quot;true&quot; stored=&quot;true&quot; /&gt;
</div><div class="line" data-line="8">    &lt;field name=&quot;pop.value&quot;      type=&quot;int&quot;          indexed=&quot;true&quot; stored=&quot;true&quot; /&gt;
</div><div class="line" data-line="9">
</div><div class="line" data-line="10">    &lt;dynamicField name=&quot;*&quot; type=&quot;text_general&quot; indexed=&quot;true&quot; stored=&quot;false&quot; multiValued=&quot;true&quot; /&gt;
</div><div class="line" data-line="11">
</div><div class="line" data-line="12">    &lt;field name=&quot;_yz_id&quot; type=&quot;_yz_str&quot; indexed=&quot;true&quot; stored=&quot;true&quot; required=&quot;true&quot;/&gt;
</div><div class="line" data-line="13">
</div><div class="line" data-line="14">    &lt;!-- Same as the default from here..... --&gt;
</div><div class="line" data-line="15">
</div><div class="line" data-line="16">  &lt;/fields&gt;
</div><div class="line" data-line="17">&lt;/schema&gt;
</div></code></pre>
<p>Uploading the schema to Riak and creating the index:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">curl</span> <span style="color: #e6edf3;">-XPUT</span> \
</div><div class="line" data-line="2">  <span style="color: #e6edf3;">-d</span> <span style="color: #e6edf3;">@sch_cities.xml</span> \
</div><div class="line" data-line="3">  <span style="color: #e6edf3;">-H</span> <span style="color: #a5d6ff;">&#39;Content-Type: application/xml&#39;</span> \
</div><div class="line" data-line="4">  <span style="color: #a5d6ff;">&#39;http://10.0.3.81:10018/search/schema/sch_cities&#39;</span>
</div><div class="line" data-line="5"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">curl</span> <span style="color: #e6edf3;">-XPUT</span> \
</div><div class="line" data-line="6">  <span style="color: #e6edf3;">-i</span> <span style="color: #a5d6ff;">&quot;http://10.0.3.81:10018/search/index/idx_cities&quot;</span> \
</div><div class="line" data-line="7">  <span style="color: #e6edf3;">-H</span><span style="color: #a5d6ff;">&#39;content-type:application/json&#39;</span> \
</div><div class="line" data-line="8">  <span style="color: #e6edf3;">-d</span><span style="color: #a5d6ff;">&#39;&lbrace;&quot;schema&quot;:&quot;sch_cities&quot;&rbrace;&#39;</span>
</div></code></pre>
<h4 id="creating-a-bucket-type"><a href="#creating-a-bucket-type">Creating a bucket type</a></h4>
<p>In Riak 2.0 there is a new feature called <a href="http://docs.basho.com/riak/2.0.0pre11/dev/advanced/bucket-types/">bucket types</a> that allows groups of buckets to share the same configuration details.</p>
<p>Creating a new bucket type called cities and activating it:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">riak</span><span style="color: #a5d6ff;">-admin</span> <span style="color: #e6edf3;">bucket-type</span> <span style="color: #e6edf3;">create</span> <span style="color: #e6edf3;">cities</span> <span style="color: #a5d6ff;">&#39;&lbrace;&quot;props&quot;:&lbrace;&quot;search_index&quot;:&quot;idx_cities&quot;&rbrace;&rbrace;&#39;</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">riak</span><span style="color: #a5d6ff;">-admin</span> <span style="color: #e6edf3;">bucket-type</span> <span style="color: #e6edf3;">activate</span> <span style="color: #e6edf3;">cities</span>
</div></code></pre>
<h4 id="loading-the-data"><a href="#loading-the-data">Loading the data</a></h4>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #e6edf3;">$</span> <span style="color: #e6edf3;">for</span> <span style="color: #e6edf3;">i</span> <span style="color: #e6edf3;">in</span> <span style="color: #e6edf3;">*.json</span> <span style="color: #e6edf3;">;</span> <span style="color: #d2a8ff;">do</span>
</div><div class="line" data-line="2">    <span style="color: #d2a8ff;">curl</span> <span style="color: #e6edf3;">-XPUT</span> <span style="color: #e6edf3;">-d</span> <span style="color: #e6edf3;">@</span><span style="color: #a5d6ff;">&quot;<span style="color: #e6edf3;">$</span><span style="color: #e6edf3;">i</span>&quot;</span> \
</div><div class="line" data-line="3">    <span style="color: #e6edf3;">-H</span> <span style="color: #a5d6ff;">&#39;Content-Type: application/json&#39;</span> \
</div><div class="line" data-line="4">    <span style="color: #e6edf3;">http://10.3.1.10:10018/types/cities/buckets/cities/keys/</span><span style="color: #a5d6ff;">&quot;<span style="color: #e6edf3;">$</span><span style="color: #e6edf3;">i</span>&quot;</span><span style="color: #e6edf3;">;</span>
</div><div class="line" data-line="5">  <span style="color: #d2a8ff;">done</span>
</div></code></pre>
<h4 id="querying-solr"><a href="#querying-solr">Querying Solr</a></h4>
<p>The search index can be queried in two ways:</p>
<ul>
<li>using the Solr interface</li>
<li>using Riak</li>
</ul>
<p>The syntax is the same. Let's find the first 100 cities with a population between 51000 and 52000, display only the name and the population, and order the result by the population. With Solr's very comprehensive query syntax you end up with something like this:</p>
<pre class="athl code-block" style="color: #e6edf3; background-color: #30363d;"><code class="language-bash" translate="no" tabindex="0"><div class="line" data-line="1"><span style="color: #d2a8ff;">http://10.3.0.10:10018/search/idx_cities?</span>
</div><div class="line" data-line="2"><span style="color: #e6edf3;">q</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">*:*</span><span style="color: #e6edf3;">&amp;</span>
</div><div class="line" data-line="3"><span style="color: #e6edf3;">rows</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">100</span><span style="color: #e6edf3;">&amp;</span>
</div><div class="line" data-line="4"><span style="color: #e6edf3;">fl</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">label.value,pop.value</span><span style="color: #e6edf3;">&amp;</span>
</div><div class="line" data-line="5"><span style="color: #e6edf3;">wt</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">json</span><span style="color: #e6edf3;">&amp;</span>
</div><div class="line" data-line="6"><span style="color: #e6edf3;">indent</span><span style="color: #79c0ff;">=</span><span style="color: #79c0ff;">true</span><span style="color: #e6edf3;">&amp;</span>
</div><div class="line" data-line="7"><span style="color: #e6edf3;">fq</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">pop.value:</span><span style="color: #a5d6ff;">[</span><span style="color: #a5d6ff;">51000+TO+52000</span><span style="color: #a5d6ff;">]</span><span style="color: #e6edf3;">&amp;</span>
</div><div class="line" data-line="8"><span style="color: #e6edf3;">sort</span><span style="color: #79c0ff;">=</span><span style="color: #a5d6ff;">pop.value+asc</span>
</div></code></pre>
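<p>Writing the query string by hand makes it easy to get the escaping wrong, so here is a minimal Python sketch that assembles the same request with the standard library. The endpoint and field names are copied from the query above. The helper name <code>build_city_query</code> is hypothetical, and the ascending sort direction is an assumption, since the text only says the results are ordered by population.</p>

```python
from urllib.parse import urlencode

# Endpoint copied from the example above; adjust host and port for your cluster.
SEARCH_URL = "http://10.3.0.10:10018/search/idx_cities"


def build_city_query(min_pop, max_pop, rows=100, base_url=SEARCH_URL):
    """Return a search URL for cities with a population in [min_pop, max_pop]."""
    params = {
        "q": "*:*",                                   # match every document
        "rows": rows,                                 # maximum number of results
        "fl": "label.value,pop.value",                # return only name and population
        "wt": "json",                                 # ask for a JSON response
        "indent": "true",                             # pretty-print the response
        "fq": f"pop.value:[{min_pop} TO {max_pop}]",  # inclusive range filter
        "sort": "pop.value asc",                      # assumption: ascending order
    }
    # urlencode escapes spaces, brackets, colons and asterisks for us.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_city_query(51000, 52000))
```

<p>The printed URL can be passed to <code>curl</code> or opened in a browser. Keeping the parameters in a dictionary also makes it simple to change the range or the number of rows without touching the escaping.</p>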
<h3 id="closing-thoughts"><a href="#closing-thoughts">Closing thoughts</a></h3>
<p>I am a huge fan. Riak is my favorite simple key-value store (o hai GET/PUT), and with Solr it becomes really easy to index what is stored in your system. I think it might be a bad idea to run Solr on the same nodes as Riak, since finding bugs would be painful, but other than that I am pretty happy with the state of Yokozuna now.</p>
<h3 id="credits"><a href="#credits">Credits</a></h3>
<p>Kudos go to #riak on freenode, especially to @coderoshi and @nikore.</p>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Wed, 05 Mar 2014 21:01:08 +0100</pubDate>
    </item>
    <item>
      <title>Wired</title>
      <link>https://dev.l1x.be/music/wired/</link>
      <guid isPermaLink="true">https://dev.l1x.be/music/wired/</guid>
      <content:encoded><![CDATA[<h1 id="hyper-wired"><a href="#hyper-wired">Hyper - Wired</a></h1>
<ul>
<li>Mobilegazer – My House 3:31</li>
<li>DJ Hyper – Shock Proof 5:49</li>
<li>DJ Hyper vs. General Midi – We've Been Waiting 4:50</li>
<li>Moguai – Get On (Hyper Mix) 6:32</li>
<li>PMT – Necromancer 5:34</li>
<li>Plump DJs – Pray For You (Lee Coombs Vocal Mix) 7:01</li>
<li>Planet Funk – Who Said (Moguai Mix) 6:58</li>
<li>DJ Hyper – Outsider 5:27</li>
<li>Infusion – Legacy (Junkie XL Remix) 6:47</li>
<li>General Midi – Entertainer 4:50</li>
<li>Soul Of Man – Dirty Waltzer (Santos Another Planet Reprise) 5:28</li>
<li>DJ Flywheel – Weatherman 4:22</li>
<li>Dan F – Line Of Sight (Original) 5:20</li>
<li>Marscruiser vs. Andy Page – Elementalectrofunk 4:50</li>
<li>DJ Hyper – Body Rok 6:18</li>
<li>Chable &amp; Bonnici – Ride (Have A Break Mix) 5:34</li>
<li>DJ Hyper vs. Überzone – Wubbie (Prototype Mix) 5:05</li>
<li>Sugababes – In The Middle (Hyper Mix) 6:20</li>
<li>Stir Fry – 'Lectro Chunk 5:12</li>
<li>Attack Force – Gourilla 6:01</li>
<li>Santos – Sabot (Santos VIP Remix) 5:14</li>
</ul>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Sun, 14 Nov 2004 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Eargasm</title>
      <link>https://dev.l1x.be/music/eargasm/</link>
      <guid isPermaLink="true">https://dev.l1x.be/music/eargasm/</guid>
      <content:encoded><![CDATA[<h1 id="plump-djs-eargasm"><a href="#plump-djs-eargasm">Plump DJs – Eargasm</a></h1>
<ul>
<li>Creepshow 6:11</li>
<li>Weighed Down 5:04</li>
<li>The Funk Hits The Fan 5:30</li>
<li>In Stereo 3:06</li>
<li>The Gate 8:06</li>
<li>Morning Sun 4:52</li>
<li>Mantra 7:22</li>
<li>Pray For You 5:18</li>
<li>Something Goin' On 4:20</li>
<li>Contact Double Zero 5:48</li>
<li>How Much Is Enough 5:47</li>
<li>Cry Wolf 6:55</li>
<li>Tilt 4:50</li>
</ul>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Fri, 14 Nov 2003 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Eye For An Eye</title>
      <link>https://dev.l1x.be/music/eye-for-an-eye/</link>
      <guid isPermaLink="true">https://dev.l1x.be/music/eye-for-an-eye/</guid>
      <content:encoded><![CDATA[<h1 id="unkle-eye-for-an-eye"><a href="#unkle-eye-for-an-eye">UNKLE - Eye For An Eye</a></h1>
<ul>
<li>Eye For An Eye (Album Version)</li>
<li>Eye For An Eye (Force Mass Motion Versus Dylan Rhymes Remix)</li>
</ul>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Fri, 14 Nov 2003 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Fractured</title>
      <link>https://dev.l1x.be/music/fractured/</link>
      <guid isPermaLink="true">https://dev.l1x.be/music/fractured/</guid>
      <content:encoded><![CDATA[<h1 id="hyper-fractured"><a href="#hyper-fractured">Hyper - Fractured</a></h1>
<p>Dislocated:</p>
<ul>
<li>1.1 Momu - Sunsicle</li>
<li>1.2 Terminalhead - Head Down</li>
<li>1.3 Kemek the Dope Computer - Let Yourself Go</li>
<li>1.4 Semi Detached - Who Da Fuck (False Prophet Remix)</li>
<li>1.5 Spork - Freeek Like Me</li>
<li>1.6 Stir Fry - Freestyle Flow</li>
<li>1.7 Meat Katie and Christian J - Cusp</li>
<li>1.8 Überzone and Rennie Pilgrem - Cous Cous (Royale Mix)</li>
<li>1.9 Oakenfold - Starry Eyed Surprise (Stir Fry Vocal)</li>
<li>1.10 Hyper - Catnip</li>
<li>1.11 PMT - Insinuendo (Principled Dub Mix)</li>
<li>1.12 Timo Maas - Der Schieber (Funkin' for Hope in New York Mix)</li>
</ul>
<p>Bruised:</p>
<ul>
<li>2.1 Proper Filthy Naughty - Beautiful Day feat. Jo Morgan</li>
<li>2.2 Dan F - Close Yer Eyez</li>
<li>2.3 Soul of Man - Acid Punch (Shock Proof Remix)</li>
<li>2.4 Dr. Motte and Westbam - Sunshine (Electro Dub Remix by WestBam)</li>
<li>2.5 Silencer - Rollin' and Controllin'</li>
<li>2.6 General Midi - You Will Be Under</li>
<li>2.7 Stisch - Poolswinger</li>
<li>2.8 Blame - Music Takes You (BLIM Remix)</li>
<li>2.9 Hyper - Slapper</li>
<li>2.10 Ils - Music (Evil 9 'Punk Rocks' Remix)</li>
<li>2.11 Fatliners - Flying feat. Spee</li>
</ul>]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Fri, 14 Nov 2003 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Retro &gt; Future</title>
      <link>https://dev.l1x.be/music/retro-future/</link>
      <guid isPermaLink="true">https://dev.l1x.be/music/retro-future/</guid>
      <content:encoded><![CDATA[<h1 id="retro-future"><a href="#retro-future">Retro &gt; Future</a></h1>
<p>Retro Mixed By Phil K:</p>
<ul>
<li>1.01 Spoon Wizard – Shoe Monkey</li>
<li>1.02 Silken – Gizmo (Spoon Wizard Mix)</li>
<li>1.03 Johnny Dangerously vs. Darren Chapman – Fists Like This (Frakkar Mix)</li>
<li>1.04 Silken – Gizmo</li>
<li>1.05 DJ Killer – I Want Your Love</li>
<li>1.06 Sensei – I'm The Only One</li>
<li>1.07 Momu – Kitty Hawk</li>
<li>1.08 Blue Effect – Aquathought</li>
<li>1.09 Phantom Beats – Stealth (Stabilizer Mix)</li>
<li>1.10 Sensei – Pimp Slap The Funk</li>
<li>1.11 An-ten-nae – Static</li>
<li>1.12 Stabilizer – Carbon</li>
<li>1.13 Frakkar – Slide</li>
</ul>
<p>Future Selected By Ben &amp; Lex:</p>
<ul>
<li>2.01 LP – Keihatsu</li>
<li>2.02 Spoon Wizard – Cutlery Charm</li>
<li>2.03 Ben &amp; Lex – Cosa Nostra</li>
<li>2.04 DJ Killer – Recycled</li>
<li>2.05 An-ten-nae – Dubout</li>
<li>2.06 Blue Effect – Flow</li>
<li>2.07 DJ Killer – Thunder</li>
<li>2.08 Frakkar – Interloper</li>
<li>2.09 Spoon Wizard – There Is No Spoon (Ben &amp; Lex Mix)</li>
<li>2.10 Diversion Distraction – Under Thunder</li>
</ul>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Fri, 14 Nov 2003 11:21:21 +0200</pubDate>
    </item>
    <item>
      <title>Signals</title>
      <link>https://dev.l1x.be/music/signals/</link>
      <guid isPermaLink="true">https://dev.l1x.be/music/signals/</guid>
      <content:encoded><![CDATA[<h1 id="silencer-signals"><a href="#silencer-signals">Silencer – Signals</a></h1>
<ul>
<li>Wired 4:00</li>
<li>Taking Hold 5:32</li>
<li>Rollin' n Controllin' 6:42</li>
<li>Believing 5:35</li>
<li>Bubblewrap 6:11</li>
<li>Dubshot 5:28</li>
<li>Rocksteady 3:58</li>
<li>Continuity 3:15</li>
<li>Drown In Me 4:56</li>
<li>No Escape 4:04</li>
</ul>
]]></content:encoded>
      <author>Istvan</author>
      <pubDate>Fri, 14 Nov 2003 11:21:21 +0200</pubDate>
    </item>
  </channel>
</rss>
