<p>alebian’s blog: Alejandro Bezdjian's technical blog about programming. Feed: https://blog.alebian.com/feed.xml (updated 2023-02-27)</p>
<h1>Add changes to previous git commits</h1>
<p>Alejandro Bezdjian, 2022-12-28. https://blog.alebian.com/git/2022/12/28/add-changes-to-previous-commit</p>
<p>When we are developing, it is always recommended to commit our changes often and in logical units. For example, in a web application, we might need to add some field in the database and let the user change that field in a form. Using an MVC framework the steps would look like:</p>
<ol>
<li>Add a migration that adds the field in the database</li>
<li>Add the field in the model</li>
<li>Pass the user input from the controller to the service that creates/updates the model</li>
<li>Add the input in the form</li>
</ol>
<p>A common git practice would be to create commits for:</p>
<ol>
<li>Adding the migration</li>
<li>Changing the model, controller and service</li>
<li>Modifying the form</li>
</ol>
<p>Now, a very common scenario is that when we are changing the form we find out there was a bug in the service, or we want to change the field name to something more meaningful, or that we forgot to add the field in the model, etc. So how do we add those changes to previously committed files without removing our commits?</p>
<p>I’m going to show you 3 ways you can do this, depending on how many commits back you want to put those changes.</p>
<h3 id="put-your-changes-in-the-last-commit">Put your changes in the last commit</h3>
<p>This is the simplest case, and you can use the following command that you may already know:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git add ...files
git commit <span class="nt">--amend</span> <span class="nt">--no-edit</span>
</code></pre></div></div>
<p>The <code class="highlighter-rouge">--no-edit</code> flag skips the step where you can edit the commit message. If you don’t pass this flag, git will open a text editor.</p>
<h3 id="put-changes-in-older-commits">Put changes in older commits</h3>
<p>This is the more complex case, and we are going to need the hash of the commit we want to change. Following the previous MVC example, running <code class="highlighter-rouge">git log</code> returns something like this:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>commit cc8a75f6753d07afaf0ece2b1a4eb26fbbfb3ec6 (HEAD -> main)
Changed form
commit fa4f42bd629de581b48024465f90e61bf71734ae
Changes in backend
commit 94b3af6d73d1c17518785a15a5e26c1ec3ce36fd
Added migration
</code></pre></div></div>
<p>Let’s say that we want to change the field name in the database, so our target commit is the one where we added the migration (<code class="highlighter-rouge">94b3af6d73d1c17518785a15a5e26c1ec3ce36fd</code>). To do this you can run:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git add migration_file
git commit <span class="nt">--fixup</span><span class="o">=</span>94b3af6d73d1c17518785a15a5e26c1ec3ce36fd
git rebase <span class="nt">--interactive</span> <span class="nt">--autosquash</span> 94b3af6d73d1c17518785a15a5e26c1ec3ce36fd^
</code></pre></div></div>
<p>The <code class="highlighter-rouge">git commit --fixup=HASH</code> command creates a commit with the same message as the referenced commit, prefixed with <code class="highlighter-rouge">fixup!</code>.
Then <code class="highlighter-rouge">git rebase --interactive --autosquash HASH^</code> opens the rebase todo list with the fixup commit already moved right after its target (provided the commit hash is correct and you used <code class="highlighter-rouge">^</code> at the end) and with its action changed from <code class="highlighter-rouge">pick</code> to <code class="highlighter-rouge">fixup</code>. After you save and close the list, git performs the desired changes.</p>
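<p>To make the reordering more concrete, here is a toy Ruby model of what happens to the todo list. It is only a sketch and not git's implementation: the <code class="highlighter-rouge">autosquash</code> method, the exact-subject matching rule and the hash <code class="highlighter-rouge">d0c0ffe</code> are all assumptions made for illustration.</p>

```ruby
# Toy model of the todo list rewrite done by --autosquash.
# This is NOT git's code; it only mimics the reordering described above.
# Each entry is [action, abbreviated_hash, subject].
def autosquash(todo)
  fixups, regular = todo.partition { |_action, _hash, subject| subject.start_with?('fixup! ') }

  regular.flat_map do |action, hash, subject|
    # Simplification: a fixup matches only when its subject is "fixup! <subject>".
    matching = fixups.select { |_a, _h, s| s == "fixup! #{subject}" }
    [[action, hash, subject]] + matching.map { |_a, h, s| ['fixup', h, s] }
  end
end

todo = [
  ['pick', '94b3af6', 'Added migration'],
  ['pick', 'fa4f42b', 'Changes in backend'],
  ['pick', 'cc8a75f', 'Changed form'],
  ['pick', 'd0c0ffe', 'fixup! Added migration'] # hypothetical hash of the fixup commit
]

autosquash(todo).each { |entry| puts entry.join(' ') }
```

<p>In this model the fixup entry ends up directly below <code class="highlighter-rouge">Added migration</code>, which is the order you should see when the editor opens.</p>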
<h3 id="another-way-to-make-changes-in-older-commits">Another way to make changes in older commits</h3>
<p>The previous way is a simplified version of a more complex process that you can do:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git stash
git rebase <span class="nt">-i</span> HEAD~3
# In the editor, mark the commit you want to change by replacing pick with edit
git stash pop
git add ...files
git commit <span class="nt">--amend</span> <span class="nt">--no-edit</span>
git rebase <span class="nt">--continue</span>
</code></pre></div></div>
<p>In this case we stash our changes, then perform an interactive rebase. By changing <code class="highlighter-rouge">pick</code> to <code class="highlighter-rouge">edit</code> you are telling git to stop the rebase at that commit so you can add all the changes that you want. After making the changes (in this case by doing a <code class="highlighter-rouge">stash pop</code>) you amend them into the commit that you marked and then continue the rebase process.</p>
<h2 id="very-important">Very important</h2>
<p>The 3 methods mentioned will change the hash of the modified commit and ALL its descendants! This is very important because if a colleague is working on the same branch, or if there are branches coming out of any of those commits, you will find a lot of conflicts and duplicated commits.</p>
<h1>Implement a kafka like consumer strategy to process records in your database</h1>
<p>Alejandro Bezdjian, 2020-11-13. https://blog.alebian.com/sql/postgresql/algorithms/2020/11/13/sql-kafka-like-consumer-strategy</p>
<p>Let’s say you have a table in your database which contains data about some files that you would like to process in batch jobs (this is just an example, it can be anything other than files). You have some job that runs every X minutes and does something like:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">files</span>
<span class="k">WHERE</span> <span class="n">processed</span> <span class="o">=</span> <span class="k">false</span>
<span class="k">LIMIT</span> <span class="mi">100</span><span class="p">;</span>
</code></pre></div></div>
<p>Then, it processes each row and updates it by flipping the <code class="highlighter-rouge">processed</code> flag. This simple implementation should work just fine if the worker can process the incoming data as fast as it is generated, but it will start lagging if not (or if you want to re-process all the rows in a reasonable time).</p>
<p>The natural thing to do in that case would be to add more workers to parallelize the process. The problem is that nothing in our query stops 2 workers from getting the same rows, so some rows can get processed more than once.</p>
<p>I want to show you one way to solve this problem by borrowing an idea from Kafka. In Kafka you can scale a topic by adding more partitions to it, which allows you to increase the number of consumers and, with them, your throughput. You can read more about Kafka consumers <a href="https://www.oreilly.com/library/view/kafka-the-definitive/9781491936153/ch04.html">here</a>.</p>
<p><img src="https://www.oreilly.com/library/view/kafka-the-definitive/9781491936153/assets/ktdg_04in02.png" alt="" /></p>
<p>Kafka guarantees that the messages written into a topic will be stored in only one partition and will be consumed by only one consumer in a consumer group, and that’s exactly what we want in our case here.</p>
<p>In order to achieve this we can use a few useful SQL functions. First we are going to need a unique hash_code for each of our rows:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">files</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">hash_code</span> <span class="nb">BIGINT</span><span class="p">;</span>
</code></pre></div></div>
<p>The “partition” will be calculated from this number, so make sure to use a good hashing function.</p>
<p>Every worker instance will act as a “consumer” in Kafka, so each one needs a unique number to avoid processing the same rows and, just like in Kafka, we won’t be able to have more “consumers” than “partitions”. How to assign that number depends on your application and is out of the scope of this post.</p>
<p>The other important number that we need is the total number of workers. Once we have those, let’s change our query to get the results we want:</p>
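<p>As a rough sketch of how such a number could be produced on the application side, the Ruby snippet below derives it from a string. The <code class="highlighter-rouge">hash_code_for</code> helper and the sample key are assumptions, not part of the original setup. It relies on SHA-256 rather than <code class="highlighter-rouge">String#hash</code>, because the latter is seeded per Ruby process and would give a different value on every run.</p>

```ruby
require 'digest'

# Hypothetical helper: derive a stable, non-negative integer from a string
# (for example the file's storage key) that fits in a signed BIGINT column.
def hash_code_for(value)
  # Read the first 8 bytes of the SHA-256 digest as an unsigned big-endian integer...
  unsigned = Digest::SHA256.digest(value).unpack1('Q>')
  # ...and clear the top bit so the value stays within the positive BIGINT range.
  unsigned & 0x7FFF_FFFF_FFFF_FFFF
end

puts hash_code_for('uploads/report.pdf') # made-up key
```

<p>Keeping the value non-negative is a deliberate choice here, so that taking the remainder of the division later never produces a negative “partition”.</p>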
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">files</span>
<span class="k">WHERE</span> <span class="n">processed</span> <span class="o">=</span> <span class="k">false</span> <span class="k">AND</span> <span class="k">MOD</span><span class="p">(</span><span class="n">hash_code</span><span class="p">,</span> <span class="p">:</span><span class="n">instances</span><span class="p">)</span> <span class="o">=</span> <span class="p">:</span><span class="n">instanceId</span>
<span class="k">LIMIT</span> <span class="mi">100</span><span class="p">;</span>
</code></pre></div></div>
<p>The main idea behind this query is to use the <a href="https://en.wikipedia.org/wiki/Modulo_operation">modulo operation</a> to find the “partition”, so <code class="highlighter-rouge">MOD(hash_code, :instances)</code> will return that number (starting from 0).</p>
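<p>The same selection rule can be simulated in plain Ruby to check that every row lands on exactly one worker. This is only an illustration: <code class="highlighter-rouge">rows_for_worker</code> and the sample rows are made up, and a real worker would run the SQL query instead.</p>

```ruby
# Ruby stand-in for the WHERE clause: MOD(hash_code, :instances) = :instanceId
def rows_for_worker(rows, instances:, instance_id:)
  rows.select { |row| row[:hash_code] % instances == instance_id }
end

# Made-up rows; real hash codes would come from your hashing function.
rows = (1..10).map { |i| { id: i, hash_code: i * 7 } }
instances = 3

# Ask for the rows of every worker ID from 0 to instances - 1.
assignments = (0...instances).map do |instance_id|
  rows_for_worker(rows, instances: instances, instance_id: instance_id).map { |row| row[:id] }
end

assignments.each_with_index { |ids, worker| puts "worker #{worker}: #{ids.inspect}" }
```

<p>One caveat if you store negative hash codes: PostgreSQL's <code class="highlighter-rouge">MOD</code> keeps the sign of the dividend, while Ruby's <code class="highlighter-rouge">%</code> follows the sign of the divisor, so the two only agree for non-negative values.</p>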
<p>Now, there is a problem with this query. Since we just added the <code class="highlighter-rouge">hash_code</code> column, there might be a lot of null values. To solve this issue, we can use the <a href="https://www.postgresqltutorial.com/postgresql-coalesce/">coalesce</a> function to turn each null value into a number. The final query should look like this:</p>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">files</span>
<span class="k">WHERE</span> <span class="n">processed</span> <span class="o">=</span> <span class="k">false</span> <span class="k">AND</span> <span class="k">MOD</span><span class="p">(</span><span class="n">COALESCE</span><span class="p">(</span><span class="n">hash_code</span><span class="p">,</span> <span class="mi">0</span><span class="p">),</span> <span class="p">:</span><span class="n">instances</span><span class="p">)</span> <span class="o">=</span> <span class="p">:</span><span class="n">instanceId</span>
<span class="k">LIMIT</span> <span class="mi">100</span><span class="p">;</span>
</code></pre></div></div>
<p>By doing this, you avoid the need to do a massive update in that table before being able to process the data. In this example, all the rows with a null value will be processed only by the worker with ID 0.</p>
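<p>A small Ruby simulation shows the effect of that default. The <code class="highlighter-rouge">partition_for</code> helper and the rows are invented for the example, with <code class="highlighter-rouge">nil</code> playing the role of SQL null.</p>

```ruby
# Ruby stand-in for MOD(COALESCE(hash_code, 0), :instances)
def partition_for(hash_code, instances)
  (hash_code || 0) % instances
end

# Made-up rows: two of them were created before the hash_code column existed.
rows = [
  { id: 1, hash_code: nil },
  { id: 2, hash_code: 10 },
  { id: 3, hash_code: nil },
  { id: 4, hash_code: 11 }
]

by_worker = rows.group_by { |row| partition_for(row[:hash_code], 2) }
by_worker.each { |worker, assigned| puts "worker #{worker}: #{assigned.map { |r| r[:id] }.inspect}" }
```

<p>Worker 0 receives its own share plus every row without a hash code, so it may carry more load until those rows are processed or given a hash code.</p>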
<p>And that’s it!</p>
<p>Of course, this method isn’t perfect: you could have idle workers if your hash codes are not evenly distributed.</p>
<p>What I like about this approach is that it is easy to understand, and the required code is simple and elegant. Other solutions may involve the use of locks, which are harder to get right in my opinion.</p>
<p>I hope you enjoyed this as much as I did! If you try it in your application let me know!</p>
<h1>Implement a simple AWS S3 multi-file downloader in Ruby</h1>
<p>Alejandro Bezdjian, 2020-03-09. https://blog.alebian.com/aws/s3/ruby/rails/webdev/2020/03/09/multi-file-zipper</p>
<p>Recently, I was asked to implement a multi-file download feature like Google Drive’s. If you use AWS S3 like me, you know that there isn’t a direct way to do this, so in this post I’ll show you a reference implementation for that.</p>
<p>We are going to need a few dependencies in our project:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Gemfile</span>
<span class="n">gem</span> <span class="s1">'aws-sdk-s3'</span><span class="p">,</span> <span class="s1">'~> 1.17'</span>
<span class="n">gem</span> <span class="s1">'rubyzip'</span><span class="p">,</span> <span class="s1">'~> 2.2'</span>
</code></pre></div></div>
<p>This post assumes you know how to configure your AWS credentials.</p>
<p>Now let’s get to the code! The goal here is to create a zip file which has all our desired files so we can send it to our users. The first thing to do is to download the files, with the official SDK this is quite simple:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="vi">@filepaths</span> <span class="o">=</span> <span class="vi">@s3_keys</span><span class="p">.</span><span class="nf">map</span> <span class="k">do</span> <span class="o">|</span><span class="n">key</span><span class="o">|</span>
<span class="n">new_path</span> <span class="o">=</span> <span class="s2">"</span><span class="si">#{</span><span class="vi">@dir</span><span class="si">}</span><span class="s2">/</span><span class="si">#{</span><span class="n">key</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="s1">'/'</span><span class="p">).</span><span class="nf">last</span><span class="si">}</span><span class="s2">"</span> <span class="c1"># Keep the filename but avoid the full path</span>
<span class="c1"># This should go in a separate service</span>
<span class="no">Aws</span><span class="o">::</span><span class="no">S3</span><span class="o">::</span><span class="no">Resource</span><span class="p">.</span><span class="nf">new</span><span class="p">.</span><span class="nf">bucket</span><span class="p">(</span><span class="vi">@bucket</span><span class="p">).</span><span class="nf">object</span><span class="p">(</span><span class="n">key</span><span class="p">).</span><span class="nf">download_file</span><span class="p">(</span><span class="n">new_path</span><span class="p">)</span>
<span class="n">new_path</span>
<span class="k">end</span>
</code></pre></div></div>
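<p>One thing to watch for when flattening keys into a single directory: two keys that share a file name will overwrite each other locally. The snippet below is a small check you could run before downloading. The keys are made up, and <code class="highlighter-rouge">File.basename</code> is used as an equivalent of <code class="highlighter-rouge">key.split('/').last</code> for keys like these.</p>

```ruby
# Made-up S3 keys; two of them end in the same file name.
keys = [
  'invoices/2020/report.pdf',
  'invoices/2019/report.pdf',
  'photos/cat.png'
]

# Group the keys by the local file name they would be saved under
# and keep only the names claimed by more than one key.
collisions = keys
  .group_by { |key| File.basename(key) }
  .select { |_name, same_name_keys| same_name_keys.size > 1 }

collisions.each do |name, same_name_keys|
  puts "#{name} would be written #{same_name_keys.size} times: #{same_name_keys.join(', ')}"
end
```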
<p>Now that we have the files on the machine, let’s create a zip with all of them. Following <a href="https://github.com/rubyzip/rubyzip#basic-zip-archive-creation">rubyzip’s documentation</a>, we can create the zipped file like this:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="no">Zip</span><span class="o">::</span><span class="no">File</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="s1">'Archive.zip'</span><span class="p">,</span> <span class="no">Zip</span><span class="o">::</span><span class="no">File</span><span class="o">::</span><span class="no">CREATE</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">zipfile</span><span class="o">|</span>
<span class="vi">@filepaths</span><span class="p">.</span><span class="nf">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">filepath</span><span class="o">|</span>
<span class="n">zipfile</span><span class="p">.</span><span class="nf">add</span><span class="p">(</span><span class="n">filepath</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="s1">'/'</span><span class="p">).</span><span class="nf">last</span><span class="p">,</span> <span class="n">filepath</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
<p>These simple steps will create our desired zip file! Now we need to put the file somewhere our users can access it. In my case I upload the zip to S3 and create a presigned URL for it, but you can change this to fit your needs:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">object</span> <span class="o">=</span> <span class="no">Aws</span><span class="o">::</span><span class="no">S3</span><span class="o">::</span><span class="no">Resource</span><span class="p">.</span><span class="nf">new</span><span class="p">.</span><span class="nf">bucket</span><span class="p">(</span><span class="vi">@bucket</span><span class="p">).</span><span class="nf">object</span><span class="p">(</span><span class="s1">'Archive.zip'</span><span class="p">)</span>
<span class="n">object</span><span class="p">.</span><span class="nf">upload_file</span><span class="p">(</span><span class="s1">'Archive.zip'</span><span class="p">)</span>
<span class="n">download_url</span> <span class="o">=</span> <span class="n">object</span><span class="p">.</span><span class="nf">presigned_url</span><span class="p">(</span><span class="ss">:get</span><span class="p">)</span>
</code></pre></div></div>
<p>Now that you get the idea, let me join everything to get this code working:</p>
<p>It’s better to have a class that hides the S3 related code as much as possible:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># s3_service.rb</span>
<span class="nb">require</span> <span class="s1">'aws-sdk-s3'</span> <span class="c1"># This is not needed in Rails</span>
<span class="k">module</span> <span class="nn">S3Service</span>
<span class="k">class</span> <span class="o"><<</span> <span class="nb">self</span>
<span class="k">def</span> <span class="nf">upload_file</span><span class="p">(</span><span class="n">from</span><span class="p">:,</span> <span class="n">to</span><span class="p">:,</span> <span class="n">bucket</span><span class="p">:)</span>
<span class="n">object</span> <span class="o">=</span> <span class="n">object</span><span class="p">(</span><span class="n">to</span><span class="p">,</span> <span class="ss">bucket: </span><span class="n">bucket</span><span class="p">)</span>
<span class="n">object</span><span class="p">.</span><span class="nf">upload_file</span><span class="p">(</span><span class="n">from</span><span class="p">)</span>
<span class="n">object</span><span class="p">.</span><span class="nf">presigned_url</span><span class="p">(</span><span class="ss">:get</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">download_file</span><span class="p">(</span><span class="n">key</span><span class="p">:,</span> <span class="n">to</span><span class="p">:,</span> <span class="n">bucket</span><span class="p">:)</span>
<span class="n">object</span> <span class="o">=</span> <span class="n">object</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="ss">bucket: </span><span class="n">bucket</span><span class="p">)</span>
<span class="n">object</span><span class="p">.</span><span class="nf">download_file</span><span class="p">(</span><span class="n">to</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">get_download_link</span><span class="p">(</span><span class="n">file_name</span><span class="p">,</span> <span class="n">bucket</span><span class="p">:)</span>
<span class="n">object</span><span class="p">(</span><span class="n">file_name</span><span class="p">,</span> <span class="ss">bucket: </span><span class="n">bucket</span><span class="p">).</span><span class="nf">presigned_url</span><span class="p">(</span><span class="ss">:get</span><span class="p">).</span><span class="nf">to_s</span>
<span class="k">end</span>
<span class="kp">private</span>
<span class="k">def</span> <span class="nf">object</span><span class="p">(</span><span class="n">file_name</span><span class="p">,</span> <span class="n">bucket</span><span class="p">:)</span>
<span class="n">bucket</span><span class="p">(</span><span class="n">bucket</span><span class="p">).</span><span class="nf">object</span><span class="p">(</span><span class="n">file_name</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">bucket</span><span class="p">(</span><span class="n">bucket</span><span class="p">)</span>
<span class="no">Aws</span><span class="o">::</span><span class="no">S3</span><span class="o">::</span><span class="no">Resource</span><span class="p">.</span><span class="nf">new</span><span class="p">.</span><span class="nf">bucket</span><span class="p">(</span><span class="n">bucket</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
<p>Now we are ready to build another class that handles the download, zip and upload of the zipped file:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># multi_file_zipper_download.rb</span>
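<span class="nb">require</span> <span class="s1">'zip'</span>
<span class="nb">require</span> <span class="s1">'digest'</span> <span class="c1"># Digest::SHA256</span>
<span class="nb">require</span> <span class="s1">'tmpdir'</span> <span class="c1"># Dir.mktmpdir</span>
<span class="nb">require</span> <span class="s1">'fileutils'</span> <span class="c1"># FileUtils.rm_rf</span>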
<span class="k">class</span> <span class="nc">MultiFileZipperDownload</span>
<span class="no">ZIPPED_FILE_NAME</span> <span class="o">=</span> <span class="s1">'Archive.zip'</span> <span class="c1"># Same as Google drive :P</span>
<span class="k">def</span> <span class="nf">initialize</span><span class="p">(</span><span class="n">s3_keys</span><span class="p">,</span> <span class="n">bucket</span><span class="p">)</span>
<span class="vi">@s3_keys</span> <span class="o">=</span> <span class="n">s3_keys</span>
<span class="vi">@bucket</span> <span class="o">=</span> <span class="n">bucket</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">call</span>
<span class="n">zip_files</span><span class="p">(</span><span class="n">download_objects</span><span class="p">)</span>
<span class="n">build_zipped_s3_key</span>
<span class="n">upload_zip</span>
<span class="n">delete_tmp_file</span>
<span class="vi">@zipped_s3_key</span>
<span class="k">end</span>
<span class="kp">private</span>
<span class="k">def</span> <span class="nf">download_objects</span>
<span class="vi">@s3_keys</span><span class="p">.</span><span class="nf">map</span><span class="p">.</span><span class="nf">each_with_index</span> <span class="k">do</span> <span class="o">|</span><span class="n">key</span><span class="p">,</span> <span class="n">idx</span><span class="o">|</span>
<span class="c1"># Avoid replacing files with same name by using the index, you can skip this if you like</span>
<span class="n">new_path</span> <span class="o">=</span> <span class="s2">"</span><span class="si">#{</span><span class="n">tmp_dir</span><span class="si">}</span><span class="s2">/</span><span class="si">#{</span><span class="n">idx</span><span class="si">}</span><span class="s2"> - </span><span class="si">#{</span><span class="n">key</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="s1">'/'</span><span class="p">).</span><span class="nf">last</span><span class="si">}</span><span class="s2">"</span>
<span class="no">S3Service</span><span class="p">.</span><span class="nf">download_file</span><span class="p">(</span><span class="ss">key: </span><span class="n">key</span><span class="p">,</span> <span class="ss">to: </span><span class="n">new_path</span><span class="p">,</span> <span class="ss">bucket: </span><span class="vi">@bucket</span><span class="p">)</span>
<span class="n">new_path</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">zip_files</span><span class="p">(</span><span class="n">files</span><span class="p">)</span>
<span class="o">::</span><span class="no">Zip</span><span class="o">::</span><span class="no">File</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="n">zipped_file_path</span><span class="p">,</span> <span class="no">Zip</span><span class="o">::</span><span class="no">File</span><span class="o">::</span><span class="no">CREATE</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">zipfile</span><span class="o">|</span>
<span class="n">files</span><span class="p">.</span><span class="nf">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">filepath</span><span class="o">|</span>
<span class="n">zipfile</span><span class="p">.</span><span class="nf">add</span><span class="p">(</span><span class="n">filepath</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="s1">'/'</span><span class="p">).</span><span class="nf">last</span><span class="p">,</span> <span class="n">filepath</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">tmp_dir</span>
<span class="vi">@tmp_dir</span> <span class="o">||=</span> <span class="no">Dir</span><span class="p">.</span><span class="nf">mktmpdir</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">zipped_file_path</span>
<span class="s2">"</span><span class="si">#{</span><span class="n">tmp_dir</span><span class="si">}</span><span class="s2">/</span><span class="si">#{</span><span class="no">ZIPPED_FILE_NAME</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">build_zipped_s3_key</span>
<span class="c1"># I use the hash of the file to avoid collisions, but you can change this to whatever you like</span>
<span class="nb">hash</span> <span class="o">=</span> <span class="no">Digest</span><span class="o">::</span><span class="no">SHA256</span><span class="p">.</span><span class="nf">file</span><span class="p">(</span><span class="n">zipped_file_path</span><span class="p">).</span><span class="nf">to_s</span>
<span class="vi">@zipped_s3_key</span> <span class="o">=</span> <span class="s2">"multi_downloads/</span><span class="si">#{</span><span class="nb">hash</span><span class="si">}</span><span class="s2">/</span><span class="si">#{</span><span class="no">ZIPPED_FILE_NAME</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">upload_zip</span>
<span class="c1"># I upload the zipped file to S3 so we can send a link to the file afterwards</span>
<span class="no">S3Service</span><span class="p">.</span><span class="nf">upload_file</span><span class="p">(</span><span class="ss">from: </span><span class="n">zipped_file_path</span><span class="p">,</span> <span class="ss">to: </span><span class="vi">@zipped_s3_key</span><span class="p">,</span> <span class="ss">bucket: </span><span class="vi">@bucket</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">delete_tmp_file</span>
<span class="c1"># Remove all the files to avoid disk usage leaks</span>
<span class="no">FileUtils</span><span class="p">.</span><span class="nf">rm_rf</span><span class="p">(</span><span class="n">tmp_dir</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
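<p>A possible variation, shown here only as a sketch: the block form of <code class="highlighter-rouge">Dir.mktmpdir</code> removes the directory when the block finishes, even if an exception is raised inside it, so the cleanup does not depend on reaching <code class="highlighter-rouge">delete_tmp_file</code>. The snippet also builds the same kind of content-based S3 key. The <code class="highlighter-rouge">ZIP_NAME</code> constant and the fake archive contents are assumptions for the example, and no S3 call is made.</p>

```ruby
require 'digest'
require 'tmpdir'

ZIP_NAME = 'Archive.zip' # hypothetical constant for this sketch

workspace = nil

# The block form yields a fresh directory and deletes it when the block ends.
s3_key = Dir.mktmpdir do |dir|
  workspace = dir
  zip_path = File.join(dir, ZIP_NAME)

  # Stand-in for the real archive produced by rubyzip.
  File.binwrite(zip_path, 'pretend these are zip bytes')

  # Same idea as build_zipped_s3_key: use the file's SHA-256 to avoid key collisions.
  "multi_downloads/#{Digest::SHA256.file(zip_path).hexdigest}/#{ZIP_NAME}"
end

puts s3_key
puts "workspace still exists? #{Dir.exist?(workspace)}"
```

<p>In the real class the upload would have to happen inside the block, while the zipped file still exists.</p>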
<p>In order to use this class you can create a simple Sinatra server to test it:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># app.rb</span>
<span class="nb">require</span> <span class="s1">'sinatra'</span>
<span class="nb">require</span> <span class="s1">'json'</span>
<span class="c1"># Here you can initialize the AWS config, use ENV variables or any other valid configuration method</span>
<span class="nb">require_relative</span> <span class="s1">'s3_service'</span>
<span class="nb">require_relative</span> <span class="s1">'multi_file_zipper_download'</span>
<span class="no">BUCKET</span> <span class="o">=</span> <span class="s1">'my-bucket'</span>
<span class="n">get</span> <span class="s1">'/keys'</span> <span class="k">do</span>
<span class="c1"># Here you can send the list of available s3 keys using your desired storage</span>
<span class="k">end</span>
<span class="n">post</span> <span class="s1">'/download'</span> <span class="k">do</span>
<span class="c1"># Get the keys from the JSON body</span>
<span class="n">params</span> <span class="o">=</span> <span class="no">JSON</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="n">request</span><span class="p">.</span><span class="nf">body</span><span class="p">.</span><span class="nf">read</span><span class="p">)</span>
<span class="n">zipped_key</span> <span class="o">=</span> <span class="no">MultiFileZipperDownload</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="n">params</span><span class="p">[</span><span class="s1">'keys'</span><span class="p">],</span> <span class="no">BUCKET</span><span class="p">).</span><span class="nf">call</span>
<span class="n">url</span> <span class="o">=</span> <span class="no">S3Service</span><span class="p">.</span><span class="nf">get_download_link</span><span class="p">(</span><span class="n">zipped_key</span><span class="p">,</span> <span class="ss">bucket: </span><span class="no">BUCKET</span><span class="p">)</span>
<span class="p">[</span><span class="mi">200</span><span class="p">,</span> <span class="p">{</span> <span class="ss">url: </span><span class="n">url</span> <span class="p">}.</span><span class="nf">to_json</span><span class="p">]</span>
<span class="k">end</span>
</code></pre></div></div>
<p>This code will get you going with this feature, but keep in mind that downloading and uploading files from S3 will block your Ruby server, so you should run this code in the background. I’ll leave that as an exercise for the reader.</p>
<h1>Six degrees of Wikipedia</h1>
<p>Alejandro Bezdjian, 2019-09-18. https://blog.alebian.com/ruby/programming/graphs/bfs/2019/09/18/six-degrees-of-wikipedia</p>
<p>A few years ago, I read an <a href="https://research.fb.com/blog/2016/02/three-and-a-half-degrees-of-separation/">interesting article</a> about the degrees of separation between Facebook users. Regardless of the result in the article, I thought it would be exciting to find the separation between other things, like Wikipedia articles.</p>
<p>For this problem, we are going to receive 2 valid Wikipedia articles and return the shortest list of articles that connects them. Two articles are connected if one of them contains a link to the other (a link in the opposite direction is not needed).</p>
<p>Let’s divide the problem into parts that we can solve separately:</p>
<ul>
<li>Get an article from Wikipedia.</li>
<li>Get all the links in an article to other articles.</li>
<li>Find the minimum path between 2 articles.</li>
</ul>
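<p>The third part is a shortest-path search on an unweighted, directed graph, which a breadth-first search handles well. Below is a minimal sketch over an in-memory graph. The <code class="highlighter-rouge">shortest_path</code> method and the <code class="highlighter-rouge">/wiki/A</code> to <code class="highlighter-rouge">/wiki/E</code> articles are invented for illustration, and a real version would fetch the links of each article over HTTP instead of reading them from a hash.</p>

```ruby
# links: hypothetical graph where each article maps to the articles it links to.
# Returns the shortest list of articles from `from` to `to`, or nil if none exists.
def shortest_path(links, from, to)
  return [from] if from == to

  previous = { from => nil } # also works as the "visited" set
  queue = [from]

  until queue.empty?
    current = queue.shift
    links.fetch(current, []).each do |neighbor|
      next if previous.key?(neighbor)

      previous[neighbor] = current
      if neighbor == to
        # Walk back through `previous` to rebuild the path.
        path = [to]
        path.unshift(previous[path.first]) while previous[path.first]
        return path
      end
      queue << neighbor
    end
  end

  nil
end

# Made-up articles and links.
links = {
  '/wiki/A' => ['/wiki/B', '/wiki/C'],
  '/wiki/B' => ['/wiki/D'],
  '/wiki/C' => ['/wiki/E'],
  '/wiki/D' => ['/wiki/E']
}

puts shortest_path(links, '/wiki/A', '/wiki/E').inspect
```

<p>Because the search explores articles level by level, the first time it reaches the target it has used the fewest possible links.</p>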
<p>For the first two, we need to fetch an article using HTTP and then, find all the <code class="highlighter-rouge">a</code> tags that reference other articles. Inspecting the HTML we can see that every article has links that we want to skip (like the main page and external links):</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nn">Services</span>
<span class="k">class</span> <span class="nc">Wikipedia</span>
<span class="no">BASE_URL</span> <span class="o">=</span> <span class="s1">'https://en.wikipedia.org'</span><span class="p">.</span><span class="nf">freeze</span>
<span class="k">class</span> <span class="o"><<</span> <span class="nb">self</span>
<span class="k">def</span> <span class="nf">article_links</span><span class="p">(</span><span class="n">article</span><span class="p">)</span>
<span class="n">article</span> <span class="o">=</span> <span class="n">get_article</span><span class="p">(</span><span class="n">article</span><span class="p">)</span>
<span class="n">article</span><span class="p">.</span><span class="nf">css</span><span class="p">(</span><span class="s1">'a'</span><span class="p">).</span><span class="nf">each_with_object</span><span class="p">([])</span> <span class="k">do</span> <span class="o">|</span><span class="n">link</span><span class="p">,</span> <span class="n">array</span><span class="o">|</span>
<span class="n">href</span> <span class="o">=</span> <span class="n">link</span><span class="p">[</span><span class="s1">'href'</span><span class="p">]</span>
<span class="n">array</span> <span class="o"><<</span> <span class="n">href</span> <span class="k">if</span> <span class="n">internal?</span><span class="p">(</span><span class="n">href</span><span class="p">)</span> <span class="o">&&</span> <span class="o">!</span><span class="n">array</span><span class="p">.</span><span class="nf">include?</span><span class="p">(</span><span class="n">href</span><span class="p">)</span>
<span class="n">array</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="kp">private</span>
<span class="k">def</span> <span class="nf">get_article</span><span class="p">(</span><span class="n">article</span><span class="p">)</span>
<span class="n">uri</span> <span class="o">=</span> <span class="no">URI</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="s2">"</span><span class="si">#{</span><span class="no">BASE_URL</span><span class="si">}#{</span><span class="n">article</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>
<span class="no">Nokogiri</span><span class="o">::</span><span class="no">HTML</span><span class="p">(</span><span class="n">uri</span><span class="p">.</span><span class="nf">read</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">internal?</span><span class="p">(</span><span class="n">link</span><span class="p">)</span>
<span class="n">link</span> <span class="o">=~</span> <span class="sr">/^\/wiki\//</span> <span class="o">&&</span> <span class="o">!</span><span class="n">link</span><span class="p">.</span><span class="nf">include?</span><span class="p">(</span><span class="s1">':'</span><span class="p">)</span> <span class="o">&&</span> <span class="n">link</span> <span class="o">!=</span> <span class="s1">'/wiki/Main_Page'</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
<p>I used the <a href="https://nokogiri.org/">Nokogiri</a> gem to parse the HTML and easily search the <code class="highlighter-rouge">a</code> tags.</p>
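<p>To see the filtering rules in isolation, here is a minimal sketch that applies the same three checks to a hand-written list of <code class="highlighter-rouge">href</code> values, without touching the network or Nokogiri. The helper name <code class="highlighter-rouge">internal_article?</code> and the sample links are made up for illustration; they are not part of the original implementation.</p>

```ruby
# Hypothetical helper mirroring the checks in `internal?` above.
MAIN_PAGE = '/wiki/Main_Page'

def internal_article?(href)
  # <a> tags without an href attribute yield nil, so guard against that first.
  return false unless href.is_a?(String)

  href.start_with?('/wiki/') && !href.include?(':') && href != MAIN_PAGE
end

# Made-up sample of href values, as they could appear in an article.
hrefs = [
  '/wiki/Ruby_(programming_language)', # article: keep
  '/wiki/Main_Page',                   # main page: skip
  '/wiki/Help:Contents',               # namespaced page: skip
  'https://example.org/',              # external link: skip
  '#cite_note-1',                      # in-page anchor: skip
  nil,                                 # <a> tag without href: skip
  '/wiki/Ruby_(programming_language)'  # duplicate: removed by uniq
]

article_links = hrefs.select { |href| internal_article?(href) }.uniq
p article_links
```

<p>Using <code class="highlighter-rouge">select</code> followed by <code class="highlighter-rouge">uniq</code> is just one way to write it; it avoids checking <code class="highlighter-rouge">include?</code> on the result array for every link.</p>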
<p>Playing with the code, I found that it would be nice to have some sort of cache so we don’t lose the data fetched from Wikipedia. I used Redis for this:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nn">Repositories</span>
<span class="k">class</span> <span class="nc">Redis</span>
<span class="no">REDIS_URL</span> <span class="o">=</span> <span class="s1">'redis://localhost:6379'</span><span class="p">.</span><span class="nf">freeze</span>
<span class="k">def</span> <span class="nf">initialize</span>
<span class="vi">@connection</span> <span class="o">=</span> <span class="o">::</span><span class="no">Redis</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">url: </span><span class="no">REDIS_URL</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">get_links</span><span class="p">(</span><span class="n">path</span><span class="p">)</span>
<span class="k">return</span> <span class="no">Oj</span><span class="p">.</span><span class="nf">load</span><span class="p">(</span><span class="vi">@connection</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="n">path</span><span class="p">))</span> <span class="k">if</span> <span class="vi">@connection</span><span class="p">.</span><span class="nf">exists?</span><span class="p">(</span><span class="n">path</span><span class="p">)</span>
<span class="n">links</span> <span class="o">=</span> <span class="no">Services</span><span class="o">::</span><span class="no">Wikipedia</span><span class="p">.</span><span class="nf">article_links</span><span class="p">(</span><span class="n">path</span><span class="p">)</span>
<span class="vi">@connection</span><span class="p">.</span><span class="nf">set</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">links</span><span class="p">.</span><span class="nf">to_json</span><span class="p">)</span>
<span class="n">links</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
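<p>The repository above follows a simple “look in the cache first, fetch on a miss” pattern. The sketch below shows the same idea with a plain Hash instead of Redis, so it can be run without any server. The class name <code class="highlighter-rouge">MemoryRepository</code>, the <code class="highlighter-rouge">fetcher</code> argument and the fake fetcher are assumptions made for this example only.</p>

```ruby
# Hypothetical in-memory stand-in for the Redis-backed repository.
class MemoryRepository
  attr_reader :fetch_count

  # `fetcher` is any callable that maps an article path to its links.
  def initialize(fetcher)
    @fetcher = fetcher
    @cache = {}
    @fetch_count = 0
  end

  def get_links(path)
    # Cache hit: return the stored links without calling the fetcher.
    return @cache[path] if @cache.key?(path)

    # Cache miss: fetch once, remember the result, and return it.
    @fetch_count += 1
    @cache[path] = @fetcher.call(path)
  end
end

# Fake fetcher used instead of real HTTP requests.
fake_wikipedia = ->(path) { ["#{path}_linked_a", "#{path}_linked_b"] }

repository = MemoryRepository.new(fake_wikipedia)
first = repository.get_links('/wiki/Foo')
second = repository.get_links('/wiki/Foo')
puts "Fetched #{repository.fetch_count} time(s)"
```

<p>Unlike Redis, this cache is lost when the process exits, so it only helps within a single run.</p>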
<p>The last part of the task is the actual algorithm. In this case, we can think of the articles as the nodes (vertices) of a graph and the links as the edges, forming a directed graph. Since following each link has the same cost (an HTTP request), we can treat this as an unweighted graph, or equivalently a weighted one with the same cost on every edge.</p>
<p>To build the graph I’ll use a gem called <a href="https://github.com/monora/rgl">RGL</a>, which also gives us a search algorithm so we don’t have to implement one. Since we don’t have the graph up front, we have to be smart about the way we create it. Traversing the links using DFS (depth-first search) may lead us to create a graph with an unnecessary amount of nodes, so I think it’s better to use BFS (breadth-first search) to build it:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nn">Crawlers</span>
<span class="k">class</span> <span class="nc">Graph</span> <span class="o"><</span> <span class="no">Base</span>
<span class="k">def</span> <span class="nf">call</span>
<span class="n">graph</span> <span class="o">=</span> <span class="no">RGL</span><span class="o">::</span><span class="no">DirectedAdjacencyGraph</span><span class="p">.</span><span class="nf">new</span>
<span class="n">queue</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">queue</span><span class="p">.</span><span class="nf">push</span><span class="p">(</span><span class="vi">@from_path</span><span class="p">)</span>
<span class="k">while</span> <span class="p">(</span><span class="n">current</span> <span class="o">=</span> <span class="n">queue</span><span class="p">.</span><span class="nf">shift</span><span class="p">)</span> <span class="o">!=</span> <span class="kp">nil</span>
<span class="n">links</span> <span class="o">=</span> <span class="vi">@repository</span><span class="p">.</span><span class="nf">get_links</span><span class="p">(</span><span class="n">current</span><span class="p">)</span>
<span class="n">links</span><span class="p">.</span><span class="nf">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">link</span><span class="o">|</span>
<span class="k">unless</span> <span class="n">graph</span><span class="p">.</span><span class="nf">has_vertex?</span><span class="p">(</span><span class="n">link</span><span class="p">)</span>
<span class="n">graph</span><span class="p">.</span><span class="nf">add_edge</span><span class="p">(</span><span class="n">current</span><span class="p">,</span> <span class="n">link</span><span class="p">)</span>
<span class="n">queue</span><span class="p">.</span><span class="nf">push</span><span class="p">(</span><span class="n">link</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">if</span> <span class="n">links</span><span class="p">.</span><span class="nf">include?</span><span class="p">(</span><span class="vi">@to_path</span><span class="p">)</span>
<span class="k">return</span> <span class="n">graph</span><span class="p">.</span><span class="nf">dijkstra_shortest_path</span><span class="p">(</span><span class="no">EdgeWeightHack</span><span class="p">.</span><span class="nf">new</span><span class="p">,</span> <span class="vi">@from_path</span><span class="p">,</span> <span class="vi">@to_path</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">class</span> <span class="nc">EdgeWeightHack</span>
<span class="c1"># The dijkstra_shortest_path expects a hash</span>
<span class="k">def</span> <span class="nf">[]</span><span class="p">(</span><span class="n">key</span><span class="p">)</span>
<span class="mi">1</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
<p>As you can see, I used a queue to add the nodes in a BFS manner and skipped the links that the graph already had. I found that the RGL gem didn’t have a search algorithm for unweighted graphs but it did have Dijkstra’s algorithm for weighted ones. It expected a hash with the edge values, but since every edge has the same value and I didn’t want to use unnecessary extra memory, I hacked together a class called <code class="highlighter-rouge">EdgeWeightHack</code>, taking advantage of Ruby’s duck typing.</p>
<p>Now let’s see this in action! Let’s find out how separated <code class="highlighter-rouge">Chuck Norris</code> is from <code class="highlighter-rouge">Computer programming</code>:</p>
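<p>The duck typing idea can be illustrated without RGL: any code that only calls <code class="highlighter-rouge">[]</code> on its weights argument cannot tell a real Hash from an object that always answers the same value. The sketch below uses made-up names (<code class="highlighter-rouge">ConstantWeight</code>, <code class="highlighter-rouge">path_cost</code>) and also shows Ruby’s built-in <code class="highlighter-rouge">Hash.new(1)</code>, which returns a default for missing keys without storing anything. Whether a given library accepts such an object depends on how it reads the map, and I have not checked that against RGL.</p>

```ruby
# Hypothetical constant-weight object: it only needs to answer `[]`.
class ConstantWeight
  def initialize(weight)
    @weight = weight
  end

  def [](_edge)
    @weight
  end
end

# Sums the weights along a path using whatever `weights` object it is given.
def path_cost(path, weights)
  path.each_cons(2).sum { |from, to| weights[[from, to]] }
end

path = %w[a b c d]

# A real hash needs one entry per edge...
explicit_weights = { %w[a b] => 1, %w[b c] => 1, %w[c d] => 1 }
# ...while these two answer 1 for any edge and store nothing.
constant_weights = ConstantWeight.new(1)
default_weights = Hash.new(1)

puts path_cost(path, explicit_weights)
puts path_cost(path, constant_weights)
puts path_cost(path, default_weights)
```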
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">################################################################</span>
Chuck_Norris is 3 degrees separated from Computer_programming.
<span class="c1">################################################################</span>
/wiki/Chuck_Norris
/wiki/Republican_Party_(United_States)
/wiki/Internal_Revenue_Service
/wiki/Computer_programming
<span class="c1">################################################################</span>
</code></pre></div></div>
<p>The algorithm had to search 1113 links to find the answer, and we (programmers) are only 3 degrees separated from Chuck! Nice!</p>
<p>After having something working, I always try to see if I can improve it. I feel that running a BFS to add the nodes and then searching for the path is doing double work, because on an unweighted graph the BFS itself already finds a minimum path!</p>
<p>Since I don’t want to search, every node I put in the graph has to know the path I used to get to it, so let’s build this custom graph:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nn">Crawlers</span>
<span class="k">class</span> <span class="nc">Custom</span> <span class="o"><</span> <span class="no">Base</span>
<span class="k">def</span> <span class="nf">call</span>
<span class="n">queue</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">queue</span><span class="p">.</span><span class="nf">push</span><span class="p">(</span><span class="no">Node</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="vi">@from_path</span><span class="p">))</span>
<span class="n">answer</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">while</span> <span class="p">(</span><span class="n">current</span> <span class="o">=</span> <span class="n">queue</span><span class="p">.</span><span class="nf">shift</span><span class="p">)</span> <span class="o">!=</span> <span class="kp">nil</span>
<span class="k">return</span> <span class="n">current</span><span class="p">.</span><span class="nf">complete_articles_path</span> <span class="k">if</span> <span class="n">current</span><span class="p">.</span><span class="nf">article</span> <span class="o">==</span> <span class="vi">@to_path</span>
<span class="n">links</span> <span class="o">=</span> <span class="vi">@repository</span><span class="p">.</span><span class="nf">get_links</span><span class="p">(</span><span class="n">current</span><span class="p">.</span><span class="nf">article</span><span class="p">)</span>
<span class="k">if</span> <span class="n">links</span><span class="p">.</span><span class="nf">include?</span><span class="p">(</span><span class="vi">@to_path</span><span class="p">)</span>
<span class="k">return</span> <span class="p">(</span><span class="n">answer</span> <span class="o">=</span> <span class="n">current</span><span class="p">.</span><span class="nf">complete_articles_path</span> <span class="o"><<</span> <span class="vi">@to_path</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">links</span><span class="p">.</span><span class="nf">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">article</span><span class="o">|</span>
<span class="k">unless</span> <span class="n">current</span><span class="p">.</span><span class="nf">previous_articles</span><span class="p">.</span><span class="nf">include?</span><span class="p">(</span><span class="n">article</span><span class="p">)</span>
<span class="n">queue</span><span class="p">.</span><span class="nf">push</span><span class="p">(</span><span class="no">Node</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="n">article</span><span class="p">,</span> <span class="n">current</span><span class="p">.</span><span class="nf">complete_articles_path</span><span class="p">))</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="n">answer</span>
<span class="k">end</span>
<span class="k">class</span> <span class="nc">Node</span>
<span class="nb">attr_reader</span> <span class="ss">:article</span><span class="p">,</span> <span class="ss">:previous_articles</span>
<span class="k">def</span> <span class="nf">initialize</span><span class="p">(</span><span class="n">article</span><span class="p">,</span> <span class="n">previous_articles</span> <span class="o">=</span> <span class="p">[])</span>
<span class="vi">@article</span> <span class="o">=</span> <span class="n">article</span>
<span class="vi">@previous_articles</span> <span class="o">=</span> <span class="n">previous_articles</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">complete_articles_path</span>
<span class="vi">@previous_articles</span> <span class="o">+</span> <span class="p">[</span><span class="vi">@article</span><span class="p">]</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
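<p>One possible variation, which I have not benchmarked against the code above: the custom crawler only skips articles that are already on the current path, so the same article can be queued several times through different routes. A BFS that keeps a single map of “who discovered this article” can act as both a visited set and a way to rebuild the path. The sketch below runs on a small made-up link graph (<code class="highlighter-rouge">LINKS</code>) instead of Wikipedia, and the method names are hypothetical.</p>

```ruby
# Toy link graph standing in for Wikipedia (made-up data).
LINKS = {
  'A' => %w[B C],
  'B' => %w[D],
  'C' => %w[D E],
  'D' => %w[F],
  'E' => %w[F],
  'F' => []
}.freeze

# Walks the parent pointers back from `node` to the start article.
def build_path(parents, node)
  path = []
  while node
    path.unshift(node)
    node = parents[node]
  end
  path
end

# Plain BFS: the first time we reach `to`, the path to it is a shortest one.
def shortest_path(links, from, to)
  parents = { from => nil } # doubles as the visited set
  queue = [from]

  until queue.empty?
    current = queue.shift
    return build_path(parents, to) if current == to

    links.fetch(current, []).each do |neighbour|
      next if parents.key?(neighbour) # already discovered through another route

      parents[neighbour] = current
      queue.push(neighbour)
    end
  end

  nil # no path exists
end

p shortest_path(LINKS, 'A', 'F')
```

<p>The trade-off is that the map of parents grows with every discovered article, while each node in the custom crawler carries its own copy of the path.</p>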
<p>Now it’s time to benchmark both solutions and see which one is better:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Benchmarking 100 times
              user     system      total        real
Custom   63.772801   9.894999  73.667800 (249.561173)
Graph   197.834715  10.884686 208.719401 (378.234052)
Memory usage of Custom ---------------------------
Total allocated: 155287889 bytes (2223856 objects)
Total retained: 546 bytes (2 objects)
Memory usage of Graph ----------------------------
Total allocated: 259432021 bytes (2015009 objects)
Total retained: 97485111 bytes (517055 objects)
</code></pre></div></div>
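<p>Wow! That’s actually a very big improvement in both speed and memory: the custom version is about 1.5 times faster in wall-clock time, allocates roughly 40% less memory and retains almost nothing!</p>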
<h2 id="conclusion">Conclusion</h2>
<p>Even though the result of solving this problem is not very useful, I had a lot of fun doing it. I was able to use the algorithms I’ve learned in a more “realistic” way (compared with the exercises found in books).</p>
<p>I hope you enjoyed this as much as I did! You can check the full implementation <a href="https://github.com/alebian/six-degrees-of-wikipedia">on my GitHub account</a>!</p>Alejandro BezdjianA few years ago, I read an interesting article about the degrees of separation between Facebook users. Regardless of the result in the article, I thought it would be exciting to find the separation between other things, like Wikipedia articles.Infinite list of prime numbers using Python generators2019-09-14T00:00:00-03:002019-09-14T00:00:00-03:00https://blog.alebian.com/python/generators/primes/2019/09/14/infinite-prime-numbers<p>Sometimes we want to create collections of elements that are very expensive to calculate. The first option is to create a list and wait until all the elements are calculated before we use it. Although this works, it is not very efficient. To make it a bit more efficient, modern languages provide a way to create custom iterators so each element is calculated only when needed (this is also called lazy evaluation). Also, iterators allow us to create infinite collections!</p>
<p>Python has, in my opinion, one of the most succinct and elegant ways to declare iterators: <a href="https://wiki.python.org/moin/Generators">generators</a>.</p>
<p>Without further ado, let’s try to create an infinite list of prime numbers.</p>
<p>The first thing we need is a way to detect if a number is prime:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">math</span> <span class="kn">import</span> <span class="n">sqrt</span>

<span class="k">def</span> <span class="nf">is_prime</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o"><=</span> <span class="mi">1</span><span class="p">):</span>
        <span class="k">return</span> <span class="bp">False</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">==</span> <span class="mi">2</span><span class="p">):</span>
        <span class="k">return</span> <span class="bp">True</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">n</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">0</span><span class="p">):</span>
        <span class="k">return</span> <span class="bp">False</span>
    <span class="n">i</span> <span class="o">=</span> <span class="mi">3</span>
    <span class="k">while</span> <span class="n">i</span> <span class="o"><=</span> <span class="n">sqrt</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
        <span class="k">if</span> <span class="n">n</span> <span class="o">%</span> <span class="n">i</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
            <span class="k">return</span> <span class="bp">False</span>
        <span class="n">i</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">2</span>
    <span class="k">return</span> <span class="bp">True</span>
</code></pre></div></div>
<p>Now, using our <code class="highlighter-rouge">is_prime</code> function we can do:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">prime_generator</span><span class="p">():</span>
    <span class="n">n</span> <span class="o">=</span> <span class="mi">1</span>
    <span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
        <span class="n">n</span> <span class="o">+=</span> <span class="mi">1</span>
        <span class="k">if</span> <span class="n">is_prime</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
            <span class="k">yield</span> <span class="n">n</span>
</code></pre></div></div>
<p>And that’s it! Just call the function and get elements from it:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">generator</span> <span class="o">=</span> <span class="n">prime_generator</span><span class="p">()</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">):</span>
    <span class="k">print</span><span class="p">(</span><span class="nb">next</span><span class="p">(</span><span class="n">generator</span><span class="p">))</span>
</code></pre></div></div>
<p>Or create a list of the first N prime numbers:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">islice</span>
<span class="n">array</span> <span class="o">=</span> <span class="p">[</span><span class="n">x</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">islice</span><span class="p">(</span><span class="n">prime_generator</span><span class="p">(),</span> <span class="mi">10</span><span class="p">)]</span>
</code></pre></div></div>
<p>As you can see, the generator definition is only a few lines long, which makes it one of the most concise ways I know to define an iterator.</p>Alejandro BezdjianSometimes we want to create collections of elements that are very expensive to calculate. The first option is to create a list and wait until all the elements are calculated before we use it. Although this works, it is not very efficient. To make it a bit more efficient, modern languages provide a way to create custom iterators so each element is calculated only when needed (this is also called lazy evaluation). Also, iterators allow us to create infinite collections!Text editor tips and tricks to boost your productivity2019-09-11T00:00:00-03:002019-09-11T00:00:00-03:00https://blog.alebian.com/editor/productivity/vscode/tools/2019/09/11/text-editor-tips-and-tricks<p>Recently, I was pair programming at work and, without thinking, I started to use a few editor tricks I had learned a while ago. My co-worker was amazed by them and picked up a few. At first I thought it was no big deal, but a few days later I began to notice that a lot of people don’t know many of these tricks. Since the tools we use have a lot of features to make us more productive, it’s a shame to let them go to waste. That’s why I thought it would be a good opportunity to share this useful knowledge and help others improve their productivity.</p>
<h2 id="building-blocks">Building blocks</h2>
<p>All the tricks I’ll show you are a combination of small pieces that seem insignificant at first sight, but are very powerful when they are combined. At the end I will show you some examples that (hopefully) show how useful these tricks are.</p>
<p>The examples were created using VS Code on a Mac, but these features should be available in most editors and IDEs.</p>
<h3 id="moving-cursor">Moving cursor</h3>
<p>Let’s start slowly but surely.</p>
<p>Use the arrows of your keyboard to move the cursor one step at a time:</p>
<p><img src="/assets/images/editor_tips/1_move_cursor.gif" alt="Move cursor one step at a time" /></p>
<p>Use <em>option + cursor</em> to move the cursor between words:</p>
<p><img src="/assets/images/editor_tips/2_move_cursor.gif" alt="Move cursor between words" /></p>
<p>Use <em>cmd + cursor</em> to move the cursor to the end or beginning of line:</p>
<p><img src="/assets/images/editor_tips/3_move_cursor.gif" alt="Move cursor to the beginning of line" /></p>
<p>Using <em>option + up/down cursor</em>, you can move the current line up or down:</p>
<p><img src="/assets/images/editor_tips/4_move_line.gif" alt="Move line" /></p>
<p>I use <em>cmd + x</em> to delete a line. This actually cuts it, but I don’t really care:</p>
<p><img src="/assets/images/editor_tips/5_delete_line.gif" alt="Delete line" /></p>
<h3 id="highlighting">Highlighting</h3>
<p>These shortcuts also seem basic, but it’s important to know them all.</p>
<p>Using <em>shift + cursor</em> we can highlight words character by character:</p>
<p><img src="/assets/images/editor_tips/6_highlight.gif" alt="Highlight characters" /></p>
<p>To improve this, we can use <em>shift + option + cursor</em> to highlight whole words:</p>
<p><img src="/assets/images/editor_tips/7_highlight.gif" alt="Highlight words" /></p>
<p>We can boost this using <em>shift + cmd + cursor</em> to select from the cursor to the beginning or end of the line:</p>
<p><img src="/assets/images/editor_tips/8_highlight.gif" alt="Highlight line" /></p>
<h3 id="multiple-cursor-magic">Multiple cursor magic</h3>
<p>Now we are getting to the most important thing, in my opinion: controlling multiple cursors.</p>
<p>First, let’s create them. We can use <em>cmd + click</em> with the mouse or <em>cmd + option + up/down cursor</em> with the keyboard only:</p>
<p><img src="/assets/images/editor_tips/9_multi_line.gif" alt="Create multiple cursors" /></p>
<p>Normally, the selections of each cursor are copied separately and you can copy and paste them anywhere. If you have the same number of cursors as when you copied this is what happens:</p>
<p><img src="/assets/images/editor_tips/10_copy_multiple_elements.gif" alt="Copy and paste multiple cursors" /></p>
<p>If you have a smaller or larger number of cursors than when you copied, all the copied selections will be pasted on every cursor you have at the moment.</p>
<p>I use the next one a lot. When you select something, you can use <em>cmd + d</em> to select the next matching selection. This is extremely useful as we will see in one of the examples at the end. Each selection will create a cursor for it:</p>
<p><img src="/assets/images/editor_tips/11_similar_highlighted.gif" alt="Same selection cursor" /></p>
<h3 id="auto-closing-characters">Auto closing characters</h3>
<p>This is similar to the HTML auto-closing tags feature (which I also recommend), but for characters that have a closing pair, for example:</p>
<p><img src="/assets/images/editor_tips/12_double_quote.gif" alt="Double quotes" /></p>
<p><img src="/assets/images/editor_tips/13_square_brackets.gif" alt="Square brackets" /></p>
<p><img src="/assets/images/editor_tips/14_brackets.gif" alt="Curly brackets" /></p>
<p><img src="/assets/images/editor_tips/15_parenthesis.gif" alt="Parenthesis" /></p>
<h3 id="examples">Examples</h3>
<p>The previous building blocks will be useful about 90% of the time. The examples I’ll show you are simplified real-world scenarios, and I hope the animations are self-explanatory (if there is something unusual I will try to explain what I did).</p>
<p><img src="/assets/images/editor_tips/exercise_1.gif" alt="Exercise 1" /></p>
<p>Here I added a space at the beginning so I would get a cursor where I wanted (this is another trick that you learn from practice).</p>
<p><img src="/assets/images/editor_tips/exercise_2.gif" alt="Exercise 2" /></p>
<p><img src="/assets/images/editor_tips/exercise_3.gif" alt="Exercise 3" /></p>
<p><img src="/assets/images/editor_tips/exercise_4.gif" alt="Exercise 4" /></p>
<p><img src="/assets/images/editor_tips/exercise_5.gif" alt="Exercise 5" /></p>
<p><img src="/assets/images/editor_tips/exercise_6.gif" alt="Exercise 6" /></p>
<p>Of course, there are other ways to solve them. Your creativity is the limit!</p>
<h2 id="extras">Extras</h2>
<p>There are other features that editors have that you definitely should know:</p>
<ul>
<li>Find and open a file by name: in my case it is <em>cmd + p</em>, but other editors and IDEs map it differently.</li>
<li>Use autocompletion: if you develop in Java or similar you are probably used to this, but editors have a lot of plugins for other languages that can help you.</li>
<li>Find words in a file using regexp. This one is a bit more difficult (because you have to know regular expressions) but is very useful from time to time:</li>
</ul>
<p><img src="/assets/images/editor_tips/exercise_7.gif" alt="Regexp selection" /></p>
<p>Once you have found the text you want with your regexp, use <em>shift + cmd + L</em> to select all the matches.</p>
<p>I hope you found these tricks useful and that they help you be more productive!</p>Alejandro BezdjianRecently, I was pair programming at work and, without thinking, I started to use a few editor tricks I had learned a while ago. My co-worker was amazed by them and picked up a few. At first I thought it was no big deal, but a few days later I began to notice that a lot of people don’t know many of these tricks. Since the tools we use have a lot of features to make us more productive, it’s a shame to let them go to waste. That’s why I thought it would be a good opportunity to share this useful knowledge and help others improve their productivity.Pascal’s triangle in Ruby for fun2019-09-07T00:00:00-03:002019-09-07T00:00:00-03:00https://blog.alebian.com/ruby/math/fibonacci/programming/2019/09/07/pascals-triangle-in-ruby-for-fun<p>The other day I came across a <a href="https://medium.com/i-math/top-10-secrets-of-pascals-triangle-6012ba9c5e23">blog post</a> talking about Pascal’s triangle and all of its interesting properties, and I thought it would be fun to implement it using Ruby.</p>
<p>Pascal’s triangle is a triangular array of the binomial coefficients, but you don’t have to calculate them in order to create the triangle, because each row can be constructed from the previous one, like this:</p>
<p><img src="/assets/images/pascal/1_building.png" alt="Building each row" /></p>
<p>Forming something like this:</p>
<p><img src="/assets/images/pascal/2_example.png" alt="First six rows of the triangle" /></p>
<p>What amazes me is that this simple construction can be used to calculate a lot of interesting things. Even though there are more efficient ways to do the same calculations I thought it’d be fun to do it this way.</p>
<h2 id="pascals-triangle-itself">Pascal’s triangle itself</h2>
<p>First, we are going to need something that calculates the triangle, so let’s create a class for this:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">PascalTriangle</span>
<span class="k">def</span> <span class="nf">initialize</span>
<span class="vi">@triangle</span> <span class="o">=</span> <span class="p">[[</span><span class="mi">1</span><span class="p">]]</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">get_file</span><span class="p">(</span><span class="n">param</span><span class="p">)</span>
<span class="k">return</span> <span class="vi">@triangle</span><span class="p">[</span><span class="n">param</span><span class="p">]</span> <span class="k">if</span> <span class="vi">@triangle</span><span class="p">[</span><span class="n">param</span><span class="p">]</span>
<span class="n">previous_file</span> <span class="o">=</span> <span class="n">get_file</span><span class="p">(</span><span class="n">param</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
<span class="vi">@triangle</span> <span class="o"><<</span> <span class="n">calculate_new</span><span class="p">(</span><span class="n">previous_file</span><span class="p">)</span>
<span class="vi">@triangle</span><span class="p">[</span><span class="n">param</span><span class="p">]</span>
<span class="k">end</span>
<span class="kp">private</span>
<span class="k">def</span> <span class="nf">calculate_new</span><span class="p">(</span><span class="n">previous_file</span><span class="p">)</span>
<span class="n">current_file</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="p">(</span><span class="n">previous_file</span><span class="p">.</span><span class="nf">size</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)).</span><span class="nf">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">idx</span><span class="o">|</span>
<span class="k">next</span> <span class="k">if</span> <span class="n">idx</span> <span class="o">==</span> <span class="n">previous_file</span><span class="p">.</span><span class="nf">size</span> <span class="o">-</span> <span class="mi">1</span>
<span class="n">current_file</span> <span class="o"><<</span> <span class="n">previous_file</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span> <span class="o">+</span> <span class="n">previous_file</span><span class="p">[</span><span class="n">idx</span> <span class="o">+</span> <span class="mi">1</span><span class="p">]</span>
<span class="k">end</span>
<span class="n">current_file</span> <span class="o"><<</span> <span class="mi">1</span>
<span class="n">current_file</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
<p>I used a recursive method to create each row only when it is needed, and memoization (a dynamic programming technique) to store the rows already computed in <code class="highlighter-rouge">@triangle</code>, so successive calls are faster.</p>
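<p>As a side note, the same idea fits in a few lines. The following is only a sketch, assuming a hypothetical <code class="highlighter-rouge">ROWS</code> cache that is not part of the class above: a <code class="highlighter-rouge">Hash</code> with a default block computes a missing row from the previous one and stores it.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># ROWS is a hypothetical name used only for this sketch.
# Asking for a missing key runs the block, which builds row n from row n - 1
# and stores it, so every row is computed only once.
ROWS = Hash.new do |cache, n|
  cache[n] = if n.zero?
    [1]
  else
    inner = cache[n - 1].each_cons(2).map { |left, right| left + right }
    [1, *inner, 1]
  end
end

p ROWS[4] #=> [1, 4, 6, 4, 1]
p ROWS[5] #=> [1, 5, 10, 10, 5, 1]
</code></pre></div></div>
<p>Like the class, this recurses once per missing row, so asking for a very high row on an empty cache could exhaust the stack.</p>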
<h2 id="fibonacci">Fibonacci</h2>
<p>Let’s start with my favourite application of the triangle: the Fibonacci sequence. You can get the elements of the sequence by summing the numbers along the shallow diagonals of the triangle:</p>
<p><img src="/assets/images/pascal/3_fibonacci.png" alt="Fibonacci" /></p>
<p>We can implement a method like this:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="vi">@triangle</span> <span class="o">=</span> <span class="no">PascalTriangle</span><span class="p">.</span><span class="nf">new</span>
<span class="k">def</span> <span class="nf">fibonacci</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="k">return</span> <span class="mi">0</span> <span class="k">if</span> <span class="n">n</span> <span class="o"><=</span> <span class="mi">1</span>
<span class="k">return</span> <span class="mi">1</span> <span class="k">if</span> <span class="n">n</span> <span class="o">==</span> <span class="mi">2</span>
<span class="n">result</span> <span class="o">=</span> <span class="mi">0</span>
<span class="c1"># Start at 2 so that get_file never receives a negative row index</span>
<span class="p">(</span><span class="mi">2</span><span class="o">..</span><span class="n">n</span><span class="p">).</span><span class="nf">reverse_each</span><span class="p">.</span><span class="nf">with_index</span> <span class="k">do</span> <span class="o">|</span><span class="n">row</span><span class="p">,</span> <span class="n">idx</span><span class="o">|</span>
<span class="n">coefficients</span> <span class="o">=</span> <span class="vi">@triangle</span><span class="p">.</span><span class="nf">get_file</span><span class="p">(</span><span class="n">row</span> <span class="o">-</span> <span class="mi">2</span><span class="p">)</span>
<span class="k">next</span> <span class="k">unless</span> <span class="n">coefficients</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span>
<span class="n">result</span> <span class="o">+=</span> <span class="n">coefficients</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span>
<span class="k">end</span>
<span class="n">result</span>
<span class="k">end</span>
<span class="p">(</span><span class="mi">1</span><span class="o">..</span><span class="mi">20</span><span class="p">).</span><span class="nf">map</span> <span class="p">{</span> <span class="o">|</span><span class="n">n</span><span class="o">|</span> <span class="n">fibonacci</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> <span class="p">}</span> <span class="c1">#=> [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]</span>
</code></pre></div></div>
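<p>To see the diagonals more directly, here is a small self-contained sketch. It assumes two hypothetical helpers, <code class="highlighter-rouge">pascal_row</code> and <code class="highlighter-rouge">diagonal_sum</code>, which are not part of the code above: each diagonal starts at the first element of a row and moves one row up and one position to the right at every step.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: builds row n (0-based) from scratch.
def pascal_row(n)
  (1..n).reduce([1]) do |row, _|
    [1, *row.each_cons(2).map { |left, right| left + right }, 1]
  end
end

# Sums the shallow diagonal that starts at the first element of row `start`.
# Positions that fall outside a row count as 0.
def diagonal_sum(start)
  (0..start).sum { |k| pascal_row(start - k).fetch(k, 0) }
end

sums = (0..7).map { |row| diagonal_sum(row) }
p sums #=> [1, 1, 2, 3, 5, 8, 13, 21]
</code></pre></div></div>
<p>The sketch recomputes every row on purpose to stay short; the memoized class is the better choice for repeated calls.</p>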
<h2 id="binomial-coefficient">Binomial coefficient</h2>
<p>Each element of the triangle corresponds to a binomial coefficient:</p>
<p><img src="/assets/images/pascal/4_binomial.png" alt="Binomial coefficients" /></p>
<p>So it’s super easy to get the value:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">binomial_coefficient</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">k</span><span class="p">)</span>
<span class="n">file</span> <span class="o">=</span> <span class="vi">@triangle</span><span class="p">.</span><span class="nf">get_file</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="n">file</span><span class="p">[</span><span class="n">k</span><span class="p">]</span>
<span class="k">end</span>
</code></pre></div></div>
<h2 id="binomial-expansions">Binomial expansions</h2>
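<p>The coefficients of the expansion of a binomial raised to a non-negative integer N appear in the Nth row of Pascal’s triangle (counting the top row as row 0). For example, row 2 is <code class="highlighter-rouge">1 2 1</code>:</p>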
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(x + y)^2 = x^2 + 2xy + y^2 = 1*x^2 + 2*xy + 1*y^2
</code></pre></div></div>
<p>With this, not only can we get the coefficients, but we can also calculate <code class="highlighter-rouge">(x+y)^n</code>:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">binomial_power</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span>
<span class="n">coefficients</span> <span class="o">=</span> <span class="vi">@triangle</span><span class="p">.</span><span class="nf">get_file</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="n">result</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">coefficients</span><span class="p">.</span><span class="nf">each_with_index</span> <span class="k">do</span> <span class="o">|</span><span class="n">coefficient</span><span class="p">,</span> <span class="n">idx</span><span class="o">|</span>
<span class="n">result</span> <span class="o">+=</span> <span class="n">coefficient</span> <span class="o">*</span> <span class="n">a</span><span class="o">**</span><span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="n">idx</span><span class="p">)</span> <span class="o">*</span> <span class="n">b</span><span class="o">**</span><span class="n">idx</span>
<span class="k">end</span>
<span class="n">result</span>
<span class="k">end</span>
</code></pre></div></div>
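<p>A quick way to convince ourselves that the expansion is right is to compare it with Ruby’s own exponentiation. This is only a sketch; <code class="highlighter-rouge">pascal_row</code> and <code class="highlighter-rouge">expand</code> are hypothetical names that do not appear in the code above.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: builds row n (0-based) from scratch.
def pascal_row(n)
  (1..n).reduce([1]) do |row, _|
    [1, *row.each_cons(2).map { |left, right| left + right }, 1]
  end
end

# Adds up every term of the expansion: coefficient * a^(n - k) * b^k.
def expand(a, b, n)
  pascal_row(n).each_with_index.sum do |coefficient, k|
    coefficient * a**(n - k) * b**k
  end
end

p expand(2, 3, 4) #=> 625
p (2 + 3)**4      #=> 625
</code></pre></div></div>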
<h2 id="powers-of-2">Powers of 2</h2>
<p>If we sum all the numbers in the Nth row of the triangle, we get <code class="highlighter-rouge">2^n</code>:</p>
<p><img src="/assets/images/pascal/5_2n.png" alt="Powers of 2" /></p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">power_of_2</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="n">coefficients</span> <span class="o">=</span> <span class="vi">@triangle</span><span class="p">.</span><span class="nf">get_file</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
<span class="n">coefficients</span><span class="p">.</span><span class="nf">sum</span>
<span class="k">end</span>
</code></pre></div></div>
<h2 id="powers-of-11">Powers of 11</h2>
<p>Here is a more complicated one. We can build the powers of 11 by concatenating the numbers of a row.</p>
<p><img src="/assets/images/pascal/6_11n.png" alt="Powers of 11" /></p>
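<p>Things get more complicated once a row contains numbers with more than one digit, which first happens in row 5 (<code class="highlighter-rouge">1 5 10 10 5 1</code>). In that case we need to carry the tens over to the number on its left:</p>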
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">power_of_11</span><span class="p">(</span><span class="n">param</span><span class="p">)</span>
<span class="n">coefficients</span> <span class="o">=</span> <span class="vi">@triangle</span><span class="p">.</span><span class="nf">get_file</span><span class="p">(</span><span class="n">param</span><span class="p">)</span>
<span class="k">if</span> <span class="n">param</span> <span class="o"><=</span> <span class="mi">4</span>
<span class="n">coefficients</span><span class="p">.</span><span class="nf">join</span><span class="p">.</span><span class="nf">to_i</span>
<span class="k">else</span>
<span class="n">coefficients_with_carry</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="n">coefficients</span><span class="p">.</span><span class="nf">reverse_each</span><span class="p">.</span><span class="nf">with_index</span> <span class="k">do</span> <span class="o">|</span><span class="n">coefficient</span><span class="p">,</span> <span class="n">idx</span><span class="o">|</span>
<span class="n">coefficient_with_carry</span> <span class="o">=</span> <span class="n">coefficient</span> <span class="o">+</span> <span class="n">coefficients_with_carry</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span>
<span class="k">if</span> <span class="n">coefficient_with_carry</span> <span class="o"><</span> <span class="mi">10</span>
<span class="n">coefficients_with_carry</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span> <span class="o">=</span> <span class="n">coefficient_with_carry</span>
<span class="n">coefficients_with_carry</span><span class="p">[</span><span class="n">idx</span> <span class="o">+</span> <span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">else</span>
<span class="n">coefficients_with_carry</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span> <span class="o">=</span> <span class="n">coefficient_with_carry</span> <span class="o">%</span> <span class="mi">10</span>
<span class="n">coefficients_with_carry</span><span class="p">[</span><span class="n">idx</span> <span class="o">+</span> <span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">coefficient_with_carry</span> <span class="o">/</span> <span class="mi">10</span> <span class="c1"># integer division stays exact for very large coefficients</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="n">coefficients_with_carry</span><span class="p">.</span><span class="nf">reverse</span><span class="p">.</span><span class="nf">join</span><span class="p">.</span><span class="nf">to_i</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
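<p>The carrying can also be delegated to integer arithmetic. Because 11 = 10 + 1, the binomial expansion says that 11^n is the sum of each coefficient multiplied by a power of 10, so weighting every coefficient by its place value and adding them performs the carries for us. A minimal sketch, assuming the hypothetical helpers <code class="highlighter-rouge">pascal_row</code> and <code class="highlighter-rouge">power_of_11_by_weights</code> (neither is part of the code above):</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: builds row n (0-based) from scratch.
def pascal_row(n)
  (1..n).reduce([1]) do |row, _|
    [1, *row.each_cons(2).map { |left, right| left + right }, 1]
  end
end

# Treats each coefficient as a "digit" weighted by 10^k.
# Integer addition takes care of the carries.
def power_of_11_by_weights(n)
  pascal_row(n).each_with_index.sum { |coefficient, k| coefficient * 10**k }
end

p power_of_11_by_weights(5) #=> 161051
p 11**5                     #=> 161051
</code></pre></div></div>
<p>This is less faithful to the "concatenate the row" picture, but it avoids the special case for small rows.</p>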
<h2 id="series">Series</h2>
<p>We can also find some well-known series of numbers in the triangle.</p>
<h3 id="perfect-squares">Perfect squares</h3>
<p>Perfect squares are numbers that can be expressed as the product of two equal integers; for example, 4 is a perfect square because you can express it as <code class="highlighter-rouge">2^2 = 4</code>. The perfect squares are found in the third column of the triangle; the trick is that you have to add each element to the one directly above it in the previous row:</p>
<p><img src="/assets/images/pascal/7_series.png" alt="Perfect squares" /></p>
<p>We can create a class that returns all of the perfect squares one by one:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">PerfectSquaresSeries</span>
<span class="k">def</span> <span class="nf">initialize</span><span class="p">(</span><span class="n">triangle</span><span class="p">)</span>
<span class="vi">@triangle</span> <span class="o">=</span> <span class="n">triangle</span>
<span class="vi">@current_file</span> <span class="o">=</span> <span class="mi">3</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">next</span>
<span class="n">previous_file</span> <span class="o">=</span> <span class="vi">@triangle</span><span class="p">.</span><span class="nf">get_file</span><span class="p">(</span><span class="vi">@current_file</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
<span class="n">file</span> <span class="o">=</span> <span class="vi">@triangle</span><span class="p">.</span><span class="nf">get_file</span><span class="p">(</span><span class="vi">@current_file</span><span class="p">)</span>
<span class="vi">@current_file</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="n">file</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">+</span> <span class="n">previous_file</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="n">series</span> <span class="o">=</span> <span class="no">PerfectSquaresSeries</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="vi">@triangle</span><span class="p">)</span>
<span class="n">series</span><span class="p">.</span><span class="nf">next</span> <span class="c1">#=> 4</span>
<span class="n">series</span><span class="p">.</span><span class="nf">next</span> <span class="c1">#=> 9</span>
<span class="n">series</span><span class="p">.</span><span class="nf">next</span> <span class="c1">#=> 16</span>
<span class="n">series</span><span class="p">.</span><span class="nf">next</span> <span class="c1">#=> 25</span>
<span class="n">series</span><span class="p">.</span><span class="nf">next</span> <span class="c1">#=> 36</span>
<span class="n">series</span><span class="p">.</span><span class="nf">next</span> <span class="c1">#=> 49</span>
<span class="n">series</span><span class="p">.</span><span class="nf">next</span> <span class="c1">#=> 64</span>
<span class="n">series</span><span class="p">.</span><span class="nf">next</span> <span class="c1">#=> 81</span>
</code></pre></div></div>
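<p>The class above starts at row 3, so the first square it returns is 4. As a sketch of the same rule applied from row 2 (where the third column begins), the snippet below treats the missing element above it as 0 and therefore also yields 1. <code class="highlighter-rouge">pascal_row</code> is a hypothetical helper, not part of the code above.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: builds row n (0-based) from scratch.
def pascal_row(n)
  (1..n).reduce([1]) do |row, _|
    [1, *row.each_cons(2).map { |left, right| left + right }, 1]
  end
end

# Third column (index 2) of row n plus the element right above it.
# fetch(2, 0) covers row 1, which has no third element.
squares = (2..9).map { |n| pascal_row(n)[2] + pascal_row(n - 1).fetch(2, 0) }
p squares #=> [1, 4, 9, 16, 25, 36, 49, 64]
</code></pre></div></div>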
<h3 id="natural-numbers">Natural numbers</h3>
<p>If we take a look at the second column we see that the natural numbers appear:</p>
<p><img src="/assets/images/pascal/8_natural.png" alt="N-hedral numbers" /></p>
<p>This is not very interesting by itself, but if we look at the successive columns we find the triangular, tetrahedral and pentatope numbers, and so on (I grouped them under the name N-hedral numbers).</p>
<h3 id="n-hedral-numbers">N-hedral numbers</h3>
<p>Each of these series can be found in the Nth column of the triangle, and we can get them like this:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">NHedralSeries</span>
<span class="k">def</span> <span class="nf">initialize</span><span class="p">(</span><span class="n">triangle</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span>
<span class="vi">@triangle</span> <span class="o">=</span> <span class="n">triangle</span>
<span class="vi">@current_file</span> <span class="o">=</span> <span class="n">n</span> <span class="o">-</span> <span class="mi">1</span>
<span class="vi">@n</span> <span class="o">=</span> <span class="n">n</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">next</span>
<span class="n">file</span> <span class="o">=</span> <span class="vi">@triangle</span><span class="p">.</span><span class="nf">get_file</span><span class="p">(</span><span class="vi">@current_file</span><span class="p">)</span>
<span class="vi">@current_file</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="n">file</span><span class="p">[</span><span class="vi">@n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="n">natural</span> <span class="o">=</span> <span class="no">NHedralSeries</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="vi">@triangle</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="mi">10</span><span class="p">).</span><span class="nf">map</span> <span class="p">{</span> <span class="n">natural</span><span class="p">.</span><span class="nf">next</span> <span class="p">}</span> <span class="c1">#=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]</span>
<span class="n">triangular</span> <span class="o">=</span> <span class="no">NHedralSeries</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="vi">@triangle</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
<span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="mi">10</span><span class="p">).</span><span class="nf">map</span> <span class="p">{</span> <span class="n">triangular</span><span class="p">.</span><span class="nf">next</span> <span class="p">}</span> <span class="c1">#=> [1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66]</span>
<span class="n">tetrahedral</span> <span class="o">=</span> <span class="no">NHedralSeries</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="vi">@triangle</span><span class="p">,</span> <span class="mi">4</span><span class="p">)</span>
<span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="mi">10</span><span class="p">).</span><span class="nf">map</span> <span class="p">{</span> <span class="n">tetrahedral</span><span class="p">.</span><span class="nf">next</span> <span class="p">}</span> <span class="c1">#=> [1, 4, 10, 20, 35, 56, 84, 120, 165, 220, 286]</span>
<span class="n">pentatope</span> <span class="o">=</span> <span class="no">NHedralSeries</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="vi">@triangle</span><span class="p">,</span> <span class="mi">5</span><span class="p">)</span>
<span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="mi">10</span><span class="p">).</span><span class="nf">map</span> <span class="p">{</span> <span class="n">pentatope</span><span class="p">.</span><span class="nf">next</span> <span class="p">}</span> <span class="c1">#=> [1, 5, 15, 35, 70, 126, 210, 330, 495, 715, 1001]</span>
</code></pre></div></div>
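<p>These columns agree with the usual closed formulas. As a small check, and only as a sketch with a hypothetical <code class="highlighter-rouge">pascal_row</code> helper that is not part of the code above, we can compare the third column with the formula for the m-th triangular number, <code class="highlighter-rouge">m * (m + 1) / 2</code>:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: builds row n (0-based) from scratch.
def pascal_row(n)
  (1..n).reduce([1]) do |row, _|
    [1, *row.each_cons(2).map { |left, right| left + right }, 1]
  end
end

# Third column (index 2) read from rows 2 through 12 of the triangle.
from_triangle = (2..12).map { |row| pascal_row(row)[2] }
# Closed formula for the first 11 triangular numbers.
from_formula = (1..11).map { |m| m * (m + 1) / 2 }

p from_triangle                 #=> [1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66]
p from_triangle == from_formula #=> true
</code></pre></div></div>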
<p>I hope you enjoyed this exercise as much as I did! You can find my complete implementation <a href="https://gist.github.com/alebian/654be128d39ea819ea89f6fdd48e301f">here</a>.</p>Alejandro BezdjianThe other day I came across a blog post talking about Pascal’s triangle and all of its interesting properties, and I thought it would be fun to implement it using Ruby.Automatic API documentation using Rails2016-05-10T00:00:00-03:002016-05-10T00:00:00-03:00https://blog.alebian.com/ruby/rails/webdev/2016/05/10/automatic-documentation<p>The technological trend in recent times is to store the user’s information in the cloud so that they can access it from any mobile, wearable or desktop device. To achieve this, you need some kind of API hosted on a server that feeds the apps, so that the business logic and data live on the server side. In this post we are going to focus on REST APIs.</p>
<p>At work, we usually build web and mobile platforms for our clients using technologies such as Angular, Android, and iOS, and we use Ruby on Rails for the backend API. As you can imagine, for a project requiring all these technologies, several specialized teams are needed in order to provide the best possible service. This workflow creates a <strong>strong and inevitable dependence between the API and the rest of the technologies</strong>. Any change made to the API is automatically reflected on the devices. If the team is working on the first version of the API (or if it doesn’t have versioning) and these changes are important, they may even cause the apps to stop working or have errors that affect usability. While it is true that the contract that provides the API should not change, there are situations that make these changes unavoidable, and thus, <strong>communication between the teams is essential</strong>.</p>
<p>One of the best ways the API team can communicate with the other teams is through the <strong>documentation</strong> of its endpoints. It is useful both for the rest of the development teams and for potential external consumers. It is also practical for the API development team itself, as it saves a lot of time spent answering questions about each of the resources exposed.</p>
<p>All developers have suffered at one point due to poor documentation. What defines poor documentation? Mainly, it’s the lack of updates and clarity on how the service should be used. How many times have we seen documentation that does not detail the headers required for the request? Or that showed a sample response that does not match reality? Or other cases in which the documented endpoint no longer exists or has changed its path?</p>
<p>The only ones responsible for this are the API developers themselves who, although aware that it is necessary to have good documentation, also know that this task is <strong>tedious</strong> and <strong>time-consuming</strong>. Good documentation is hard to maintain when you’re under pressure from clients’ deadlines and the speed of the market. Therefore, documenting is often not prioritized, and even when it is, the documentation is very likely to become <strong>outdated</strong> quickly. Consequently, it’s delayed or written in a hurry (almost like a rough draft), while other features of the system are the focus.</p>
<p>Many tools have emerged over time to simplify the task. <a href="https://apiary.io/">Apiary</a>, for example, allows making a mock of all endpoints and their responses, giving the option for the mobile applications to consume these fake resources without having to deploy them to the real API. It is very useful because app development is not delayed if the API team is having trouble delivering on time, and can also be used as documentation because you can add descriptions to the endpoints, clearly showing the way it is used and the response received.</p>
<p>However, since the data is mocked, any change made in the implementation must be quickly replicated in the documentation to keep it up to date. The biggest problem is that some changes, such as a serializer change, affect many endpoints. The developers themselves may even be <strong>not aware of the extent</strong> of the change if they are working with a fairly large API. So, although this tool has many advantages, its documentation is very prone to becoming obsolete.</p>
<p>There are other tools that require the API developers to write extra code (usually some sort of comment) above the method that they want to document. You usually list its description, parameters, a response example and sometimes more. This is the traditional way to document in languages such as Java. These tools create beautiful HTML documents, but have certain disadvantages: developers must keep the documented response up to date by hand, and the API could return a very large JSON or XML, causing the controller to have many lines of comments and/or code for documentation. All those added lines make the code more complex and harder to read. The documentation can even be longer than the code itself!</p>
<p>Tools that try to address this issue are starting to appear. These are the ones that generate documentation from the tests. A robust and good quality API <strong>MUST have tests</strong>, and such tests should cover at least the most common use cases. That’s why the information used in these tests can help in understanding how the API works, and it would be very useful to be able to extract it <strong>automatically</strong>. Addressing the problem this way, the documentation is updated every time the tests are run! Furthermore, we are <strong>forced to have a good amount of API use case tests</strong>. A win-win situation. The tools that apply this methodology have a small disadvantage: they come with their own DSL, which steepens the developer’s learning curve.</p>
<p>That’s why I decided to create Dictum, a tool with these characteristics for Rails that is <strong>easy</strong> to use and <strong>powerful</strong> enough to let you <strong>customize</strong> it any way you want!</p>
<p>So far, it creates documentation in Markdown and HTML formats. Let’s see a short example.</p>
<p>First you have to add the configuration file:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># /config/initializers/dictum.rb</span>
<span class="no">Dictum</span><span class="p">.</span><span class="nf">configure</span> <span class="k">do</span> <span class="o">|</span><span class="n">config</span><span class="o">|</span>
<span class="n">config</span><span class="p">.</span><span class="nf">output_path</span> <span class="o">=</span> <span class="no">Rails</span><span class="p">.</span><span class="nf">root</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="s1">'docs'</span><span class="p">)</span>
<span class="n">config</span><span class="p">.</span><span class="nf">root_path</span> <span class="o">=</span> <span class="no">Rails</span><span class="p">.</span><span class="nf">root</span>
<span class="n">config</span><span class="p">.</span><span class="nf">output_filename</span> <span class="o">=</span> <span class="s1">'Documentation'</span>
<span class="n">config</span><span class="p">.</span><span class="nf">output_format</span> <span class="o">=</span> <span class="ss">:markdown</span>
<span class="k">end</span>
</code></pre></div></div>
<p>Then you can customize your RSpec <code class="highlighter-rouge">after(:each)</code> hook like this:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># spec/support/spec_helper.rb</span>
<span class="no">RSpec</span><span class="p">.</span><span class="nf">configure</span> <span class="k">do</span> <span class="o">|</span><span class="n">config</span><span class="o">|</span>
<span class="n">config</span><span class="p">.</span><span class="nf">after</span><span class="p">(</span><span class="ss">:each</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="nb">test</span><span class="o">|</span>
<span class="k">if</span> <span class="nb">test</span><span class="p">.</span><span class="nf">metadata</span><span class="p">[</span><span class="ss">:dictum</span><span class="p">]</span>
<span class="no">Dictum</span><span class="p">.</span><span class="nf">endpoint</span><span class="p">(</span>
<span class="ss">resource: </span><span class="nb">test</span><span class="p">.</span><span class="nf">metadata</span><span class="p">[</span><span class="ss">:described_class</span><span class="p">].</span><span class="nf">to_s</span><span class="p">.</span><span class="nf">gsub</span><span class="p">(</span><span class="s1">'Controller'</span><span class="p">,</span> <span class="s1">''</span><span class="p">),</span>
<span class="ss">endpoint: </span><span class="n">request</span><span class="p">.</span><span class="nf">fullpath</span><span class="p">,</span>
<span class="ss">http_verb: </span><span class="n">request</span><span class="p">.</span><span class="nf">env</span><span class="p">[</span><span class="s1">'REQUEST_METHOD'</span><span class="p">],</span>
<span class="ss">description: </span><span class="nb">test</span><span class="p">.</span><span class="nf">metadata</span><span class="p">[</span><span class="ss">:dictum_description</span><span class="p">],</span>
<span class="ss">request_body_parameters: </span><span class="n">request</span><span class="p">.</span><span class="nf">env</span><span class="p">[</span><span class="s1">'action_dispatch.request.parameters'</span><span class="p">],</span>
<span class="ss">response_status: </span><span class="n">response</span><span class="p">.</span><span class="nf">status</span><span class="p">,</span>
<span class="ss">response_body: </span><span class="n">response</span><span class="p">.</span><span class="nf">body</span>
<span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
<p>After that, tell Dictum which tests you are going to document:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># spec/controllers/my_resource_controller_spec.rb</span>
<span class="nb">require</span> <span class="s1">'spec_helper'</span>
<span class="n">describe</span> <span class="no">MyResourceController</span> <span class="k">do</span>
<span class="no">Dictum</span><span class="p">.</span><span class="nf">resource</span><span class="p">(</span><span class="ss">name: </span><span class="s1">'MyResource'</span><span class="p">,</span> <span class="ss">description: </span><span class="s1">'This is MyResource description.'</span><span class="p">)</span>
<span class="n">describe</span> <span class="s1">'#some_method'</span> <span class="k">do</span>
<span class="n">context</span> <span class="s1">'some context of my resource'</span> <span class="k">do</span>
<span class="n">it</span> <span class="s1">'returns status ok'</span><span class="p">,</span> <span class="ss">dictum: </span><span class="kp">true</span><span class="p">,</span> <span class="ss">dictum_description: </span><span class="s1">'Some description of the endpoint.'</span> <span class="k">do</span>
<span class="n">get</span> <span class="ss">:index</span>
<span class="n">expect</span><span class="p">(</span><span class="n">response</span><span class="p">.</span><span class="nf">status</span><span class="p">).</span><span class="nf">to</span> <span class="n">eq</span><span class="p">(</span><span class="mi">200</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
<p>And finally run: <code class="highlighter-rouge">bundle exec rake dictum:document</code></p>
<p>That was really simple, wasn’t it? You can read the gem’s <a href="https://github.com/alebian/dictum/blob/master/README.md">README</a> if you need more information.</p>
<p>The gem is in a very early stage and has a long way to go, so feel free to report issues or make pull requests if you liked it!</p>Alejandro BezdjianThe technological trend in recent times is to store the user’s information in the cloud so that they can access it from any mobile, wearable or desktop device. To achieve this, you need some kind of API hosted on a server that feeds the apps, so that the business logic and data live on the server side. In this post we are going to focus on REST APIs.