Core Elixir: Collection to List

First, we need to define some terms: Elixir Collections (anything that implements the Enumerable protocol) include things like HashDicts, Maps, Ranges, and Lists. A List is a very specific type of collection: It’s a singly linked list, basically. It has a guaranteed order, which map-like collections such as HashDicts don’t promise.

Reaching into a list is expensive. You can’t get to the nth member of the list without going through the first n-1 members to get there, so anything that involves the end of the list means traversing the whole thing. There are no direct references to individual members of the list. No indices. No extra pointers.

The Enum module works with collections. It takes the place of the various types of loops you might write in other languages.

Lists, in particular, have their own module, cleverly named List, that handles functions that make sense only to lists and not collections as a whole. You can flatten a list there, find an item in a specific position, or fold a list to the left or right, for four examples.

So, to sum it up:

  • Lists are a special form of Collections.
  • Enum deals with Collections.
  • List deals with list-specific things Enum isn’t appropriate for.
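To make that split concrete, here is a minimal sketch. The sample values and variable names are made up for illustration; the functions themselves (Enum.map/2, List.flatten/1, List.last/1) are standard library calls.

```elixir
# Enum works on any enumerable; here a range and a map.
doubled = Enum.map(1..3, fn n -> n * 2 end)
keys = Enum.map(%{a: 1, b: 2}, fn {key, _value} -> key end)

# List holds the helpers that only make sense for lists.
flattened = List.flatten([1, [2, [3]]])
last = List.last([1, 2, 3])

IO.inspect({doubled, keys, flattened, last})
```

Note that the Enum calls hand back lists no matter what kind of collection went in, which is the thread the rest of this article pulls on.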

With that out of the way:

Conversion Therapy

Can we convert a collection to a list? Of course we can! We use Enum.to_list/1.

The crazy thing is how that function does it.

Remember that with Elixir it’s easy to separate the head (first value) from the tail (everything afterwards) of a list. It’s not easy to do the reverse and break the last value off the rest of the list. (The List.last function does this by traversing its way all the way to the end in tail-recursive style.) You need to always move from the front to the back of a list; performance can be rather dismal, particularly as your list gets bigger.
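Here is a small sketch of both halves of that paragraph: splitting off the head with a pattern match, and walking to the end to find the last value. LastSketch is a hypothetical module written for this illustration, not the real List.last source, and it deliberately skips the empty-list case.

```elixir
defmodule LastSketch do
  # Base case: a one-element list, so that element is the last one.
  def last([only]), do: only
  # Otherwise drop the head and keep walking the tail.
  def last([_head | tail]), do: last(tail)
end

# Splitting the front off is a single pattern match.
[head | tail] = [1, 2, 3, 4]
IO.inspect({head, tail})

# Getting the last value means visiting every element on the way.
IO.inspect(LastSketch.last([1, 2, 3, 4]))
```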

There’s no direct way of converting a collection into a list. You can’t wave a magic wand and have it happen. Instead, you trick Elixir into looping over every value in the collection and adding those values to a list. As it happens, that’s a side effect of one of the Enum functions! How convenient.

Thinking Backwards

The Elixir Enum.reverse function takes in a collection and reverses it, making it a list in the process. It uses reduce and everything. Look:

  def reverse(collection, tail) do
    reduce(collection, to_list(tail), fn(entry, acc) ->
      [entry|acc]
    end)
  end

That reduce statement goes item by item in the collection and keeps putting the next item at the head of a new list. See that line perfectly in the middle of the code? [entry|acc] is forming that list. acc is the list so far (the accumulator), and entry is the head of the remaining collection that gets stapled onto the front of the list.

When all is said and done, you have a list of elements that came out of a collection.

The new list it creates, however, is in the reverse of the order the collection started with, since you’re always placing the next element ahead of all the others so far.
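If you want to watch the accumulator grow, this sketch may help. It uses Enum.scan/3, which applies the same kind of function as reduce but keeps every intermediate accumulator. The range 1..4 is just sample input, and this is an illustration of the idea rather than the library's own code.

```elixir
# Each step staples the next entry onto the front of the list so far.
steps = Enum.scan(1..4, [], fn entry, acc -> [entry | acc] end)
IO.inspect(steps)

# reduce only hands back the final accumulator.
reversed = Enum.reduce(1..4, [], fn entry, acc -> [entry | acc] end)
IO.inspect(reversed)
```

The last entry in steps is the same list that reduce returns, and it comes out back to front.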

For example, we’ll take a new Map (which is a collection), give it some values, and see what happens when we Enum.reverse it:

iex> m = Map.new()
%{}
iex> m = Map.put_new(m, :a, 1)
%{a: 1}
iex> m = Map.put_new(m, :b, 2)
%{a: 1, b: 2}
iex> m = Enum.reverse(m)
[b: 2, a: 1]

(Yes, I know the Enum.into trick, but for clarity’s sake, I’m spelling this out long hand. Maybe we’ll talk about Enum.into in the future…)

Look at that: You have a list now. You can tell because it’s surrounded in brackets and not a percent sign and curly braces. Or, you can tell programmatically:

iex> is_list(m)
true

Elixir’s List module doesn’t contain a reverse function. That’s because the List module only contains functions that wouldn’t make sense as an Enum function.

Up to this point, you’ve done two things at the same time: Converted a collection to a list, and reversed the order. That’s more than you wanted to do, though. You just wanted the conversion part, not the reversal part.

Since you just wanted to convert the collection to a list, you need to reverse the list back into its original order. Since it’s a list now, it makes sense to use the list version of reverse. On the off chance you skipped what I wrote two paragraphs ago: There is no Elixir List version of the reverse function.

We do the next best thing: we call directly out to Erlang’s.

That follows Elixir’s standards. Per the List module documentation:

A decision was taken to delegate most functions to Erlang’s standard library but follow Elixir’s convention of receiving the target (in this case, a list) as the first argument.

In any case, whether you do the second reversal with the Enum or :lists library, it will work, since we’re dealing with a list at that point, either way:

iex> Enum.reverse(%{a: 1, b: 2}) |> Enum.reverse
[a: 1, b: 2]
iex> Enum.reverse(%{a: 1, b: 2}) |> :lists.reverse
[a: 1, b: 2]

But the source code goes with the latter.

That’s 800 words to describe what is a one-liner in Core Elixir:

  def to_list(collection) do
    reverse(collection) |> :lists.reverse
  end
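You can try the double reverse outside of the Enum module with a sketch like this one. ToListSketch is a hypothetical module name, and newer Elixir releases may well implement Enum.to_list differently, so treat it as a re-creation of the source quoted above rather than the current implementation.

```elixir
defmodule ToListSketch do
  # Reverse once in Elixir (which builds a list), then once more in Erlang.
  def to_list(collection) do
    collection |> Enum.reverse() |> :lists.reverse()
  end
end

IO.inspect(ToListSketch.to_list(1..3))
IO.inspect(ToListSketch.to_list(%{a: 1, b: 2}) == Enum.to_list(%{a: 1, b: 2}))
```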

But we’re not done yet!

The Easier Way (Doesn’t Work)

05 August 2015: Major updates to change the example code in this section to be the same as the Maps-based example above. A new section has been added right afterwards, as well.

Wait, isn’t there a way to traipse across the collection one item at a time and not reverse the subsequent list?

What if we took the Enum.reverse function and rewrote it?

Here again is what the key part of it looks like today:

  def reverse(collection, tail) do
    reduce(collection, to_list(tail), fn(entry, acc) ->
      [entry|acc]
    end)
  end

As an exercise, try flip-flopping the [entry|acc] to give you [acc|entry]. Don’t you think that might stop the loop over the collection from returning results in reverse order?

Sorta, but the results are not pretty:

[[[] | {:a, 1}] | {:b, 2}]

If you tease that apart, the first item in the list (the head) is a list of an empty list with a tail of {:a, 1}.
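That shape has a name: it’s an improper list, meaning its tail isn’t a list. A quick sketch with hd/1 and tl/1 shows the difference; the variable names are only for illustration.

```elixir
# A proper list: the part after the pipe is itself a list.
proper = [1 | [2, 3]]
IO.inspect(proper == [1, 2, 3])

# An improper list: the part after the pipe is a tuple, not a list.
improper = [[] | {:a, 1}]
IO.inspect(hd(improper))
IO.inspect(tl(improper))
IO.inspect(is_list(tl(improper)))
```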

The reverse/2 call is fronted by this reverse/1 call:

  def reverse(collection) do
    reverse(collection, [])
  end

The programmer just sends in a collection and doesn’t worry about it. Let the library worry about the accumulator. And, here, it’s seeded as an empty list, [].

Since we seeded the reduce with [] as the accumulator, that empty list becomes the head and {:a, 1} becomes the tail. Then that whole construct becomes the new head, while the next value, {:b, 2}, becomes the tail of a list whose head is [[] | {:a, 1}].

If we extended the original map out a little bit, the new list looks even wonkier. (I rewrote the reverse function in a new module I named after myself. I’m not just an egomaniac, but I am very fast at typing my own name.)

iex> Augie.reverse(%{a: 1, b: 2, c: 3, d: 4, e: 5, f: 6})
[[[[[[[] | {:a, 1}] | {:b, 2}] | {:c, 3}] | {:d, 4}] | {:e, 5}] | {:f, 6}]

This is the head of that list:

[[[[[[] | {:a, 1}] | {:b, 2}] | {:c, 3}] | {:d, 4}] | {:e, 5}]

And the tail:

{:f, 6}

So I suppose you could do a weird kind of reverse recursion where you keep dealing with the tail and passing the head along. You’d process the list backwards. And since that would go against every bit of conventional wisdom in programming, it’s probably safe to ignore that. It also doesn’t bring us closer to converting a collection to a list without the second reverse.

This is how the new list is then constructed:

[ [] | {:a, 1} ]
[ [ [] | {:a, 1}] | {:b, 2}]
[ [ [ [] | {:a, 1}] | {:b, 2}] | {:c, 3}]
[ [ [ [ [] | {:a, 1}] | {:b, 2}] | {:c, 3}] | {:d, 4} ]
etc. 

I stopped there before you got dizzy from the brackets and pipes. I’ve spent my whole career avoiding Lisp. This is getting perilously too close.

Note the distinct lack of commas in there. You couldn’t flatten that list if you tried. And, yes, I tried. Because I’m thorough:

iex> list = [[[[[[[[] | {:a, 1}] | {:b, 2}] | {:c, 3}] | {:d, 4}] | {:e, 5}] | {:f, 6}], {:g, 7}]
iex> List.flatten(list)
** (FunctionClauseError) no function clause matching in :lists.do_flatten/2
    (stdlib) lists.erl:625: :lists.do_flatten(9, '\n')
    (stdlib) lists.erl:626: :lists.do_flatten/2

You broke Erlang with that crazy request… Congratulations.

New on August 5, 2015 – You CAN Do It!

I received an email from a reader, Roman, who made a smart suggestion to help fix this. Instead of [ acc | entry ], try [ acc | [entry] ]. The system is expecting a list after the pipe “|”, so give it one.

The new reverse function looks like this:

defmodule Augie do
    def reverse(collection, tail \\ []) do
        Enum.reduce(collection, Enum.to_list(tail), fn(entry, acc) ->
            # Wrap entry in a list so the result stays a proper list.
            [acc|[entry]]
        end)
    end
end

Look at the big difference that gives us in results, before and after:

iex> Augie.reverse(%{a: 1, b: 2, c: 3, d: 4, e: 5, f: 6}) # [ acc | entry ]
[[[[[[[] | {:a, 1}] | {:b, 2}] | {:c, 3}] | {:d, 4}] | {:e, 5}] | {:f, 6}]

iex> Augie.reverse(%{a: 1, b: 2, c: 3, d: 4, e: 5, f: 6})  # [ acc | [entry] ]
[[[[[[[], {:a, 1}], {:b, 2}], {:c, 3}], {:d, 4}], {:e, 5}], {:f, 6}]

All of those pipes for the head/tails separators have been replaced by glorious commas. Now you can flatten it:

iex> Augie.reverse(%{a: 1, b: 2, c: 3, d: 4, e: 5, f: 6}) |> List.flatten
[a: 1, b: 2, c: 3, d: 4, e: 5, f: 6]

Doesn’t that look prettier now? And you don’t need to reverse it anymore, either!

So why not go this way? It’s slower.

Just as a down and dirty test, I ran these two lines in iex to see how many microseconds it would take to create a list with 100,000 entries and create a flat version of it. I used the Erlang :os.timestamp function to grab the time before and after each calculation. Even with the extra step, the current Enum.reverse clearly wins:

{_,_,c} = :os.timestamp; 1..100000 |> Augie.reverse |> List.flatten;  {_, _, c1} = :os.timestamp; IO.puts c1 - c;

{_,_,c} = :os.timestamp; 1..100000 |> Enum.reverse |> List.flatten |> :lists.reverse; {_, _, c1} = :os.timestamp; IO.puts c1 - c;

The results are never identical, but the range for the current Enum.reverse solution sits somewhere in the 20,000 – 23,000 microseconds range, while the Augie.reverse solution ranges between 24,000 and 30,000 microseconds.
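If you want to repeat the comparison, here is a hedged sketch that uses Erlang’s :timer.tc/1 instead of subtracting timestamps. :os.timestamp returns {megaseconds, seconds, microseconds}, so subtracting only the third element goes wrong whenever a run crosses a second boundary. TimingSketch and its function names are made up for this illustration, and the numbers you get will depend on your machine.

```elixir
defmodule TimingSketch do
  # Roman's approach: build nested proper lists, then flatten them.
  def flatten_way(enumerable) do
    enumerable
    |> Enum.reduce([], fn entry, acc -> [acc | [entry]] end)
    |> List.flatten()
  end

  # The approach from the Enum source quoted earlier: reverse twice.
  def double_reverse_way(enumerable) do
    enumerable |> Enum.reverse() |> :lists.reverse()
  end
end

# :timer.tc/1 returns {elapsed_microseconds, result}.
{flatten_us, flat} = :timer.tc(fn -> TimingSketch.flatten_way(1..100_000) end)
{reverse_us, reversed} = :timer.tc(fn -> TimingSketch.double_reverse_way(1..100_000) end)

IO.puts("flatten: #{flatten_us} microseconds")
IO.puts("double reverse: #{reverse_us} microseconds")
IO.inspect(flat == reversed)
```

A single run like this is still down and dirty; run it a few times before drawing conclusions.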

Thanks again to Roman for pointing this out. It’s a good reminder to keep more proper lists…

The Anti-Climax

This is not an essay that ends with a brilliant pull request to convert a collection into a list in one less step with moderate gains in speed and performance.

I don’t have an answer to this. Honestly, this isn’t a problem that needs a solution. That’s not why I started writing this one. This is about finding out how Elixir works behind the scenes. What are the quirks of the language? Where does it hand things off to Erlang? What would be useful for you to know as a programmer?

Sometimes, it’s a winding path that goes in circles as we blindly grope for an answer. Or an explanation.

It’s that little kid’s tendency to ask “Why?” constantly that drives this series.

Even when we hit the bottom without a clickbait twist to put in the headline.

It’s just plain old Elixir. And it’s lots of fun.

Did you ever think the way to convert one data type to another is to use two functions that do completely unrelated things to the task at hand, but do the same thing logistically in two different languages? Crazy, right?

Post Script: Did You Know?

:lists.reverse has two different versions in Erlang. You have your pick between sending one argument or two. What’s the difference? The first argument in both cases is the list you’re looking to turn around.

When the arity is 2, though, the second argument is another list that will be added to the end of your reversed list, just in case you need that kind of thing:

iex> :lists.reverse([1,2,3,4,5])
[5, 4, 3, 2, 1]
iex> :lists.reverse([1,2,3,4,5],[100,200,300])
[5, 4, 3, 2, 1, 100, 200, 300]

Note that the second list doesn’t get reversed.

If your second argument isn’t a list, it becomes the list’s new tail:

iex> :lists.reverse([1,2,3,4,5],1001)
[5, 4, 3, 2, 1 | 1001]

(Again, don’t try to run flatten on that. It doesn’t work that way. Weren’t you paying attention 500 words ago?!?)

Elixir has the same thing. When you run Enum.reverse against a collection, it actually calls Enum.reverse/2, with an empty list as the tail.

But if you call Enum.reverse/2 on purpose with some list to add to the end of a collection, then you’re actually using an optimization in the language. Elixir could just reverse the collection and then append the tail to it like this:

Enum.concat(Enum.reverse(collection), tail)

Instead, Elixir pulls out Yet Another Reduce function:

reduce(collection, to_list(tail), fn(entry, acc) ->
      [entry|acc]
    end)

Really, is there anything that reduce can’t do?
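A quick sketch of the two routes side by side, with made-up sample lists. Both calls are ordinary Enum functions; the point is only that they land on the same result.

```elixir
# One pass: reverse the collection straight onto the tail.
with_tail = Enum.reverse([1, 2, 3], [100, 200, 300])

# Two steps: reverse first, then append the tail.
two_steps = Enum.concat(Enum.reverse([1, 2, 3]), [100, 200, 300])

IO.inspect(with_tail)
IO.inspect(with_tail == two_steps)
```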

Summing it all up

If you feel the need to convert a collection to a list, just reverse it. Twice. Once in Elixir, once in Erlang.

For extra credit, pin a tail on it afterwards.

If you have any comments, questions, complaints, criticisms, or corrections, catch me on Twitter, @AugieDB. Or make a pull request on Github! That Twitter handle and Github ID is the same as my GMail account, if you want to deal with it more quietly. I want these articles to be factually correct and will update them as necessary.


Forgive Me

I have failed the religion of functional programming this week.

I attempted to loop over a list to populate a map using Enum.map.

When that didn’t work, I attempted a list comprehension.

When that didn’t work, I blamed scoping issues with closures.

And then, just when I was ready to dive off a cliff, I realized the error of my ways.

I used recursion.

It solved all my problems, including one nasty function where I used five temporary variables just to do a proper match. (Honest, I always planned to refactor that at some point…)

Confession is good for the soul, so I seek absolution in the most public of ways: Blogging.

(I already tweeted about it earlier tonight, so where else did I have to go?)

I have begun a six step process to grieve for these unearthly sins. I think the sixth step is to publish the code as a gist or something. A before and after. We’ll see if my ego grants me that privilege by the end of the week…

Kids, use recursion. You can thank me later.

The Wavelengths of My Life

I wrote this last week while attending OSCon 2015 in Portland, OR.

This week, I have the following items on my person at nearly all times:

  • A lanyard that includes an RFID chip that gets scanned when I walk into conference rooms.

  • A keyfob for my car rental. It’s one of those keyless cars, so I keep it in my pocket and can un-lock the door just by pulling on the handle, and start it by pushing a button.

  • My hotel keycard, which I tap on the door to unlock.

  • My Continuous Glucose Monitor, which sends a low-power radio signal every five minutes to —

  • My insulin pump, which receives the CGM signals and also occasional signals from —

  • My blood glucose monitor, which is in my backpack.

  • My cell phone, with Bluetooth turned on for proper podcast listening in the car and calls home. Plus, a cellular radio in constant contact with a series of towers across the area.

If I turn green when I get home, this is why.

The Best Videos of OSCon 2015

O’Reilly doesn’t give away the videos of the presentations at OSCon, ironically enough. You have to buy them as a package.

They do, however, live cast the keynotes on a daily basis to the internet and post those videos on-line immediately.

Here are my favorites:

“Situation Normal, Everything Must Change” by Simon Wardley

“The Future is Awesome” by Paul Fenwick

“Making Architecture Matter” by Martin Fowler

“Change-Making at the Largest Public Interest Startup” by Mikey Dickerson

If you can only watch one, I’d go with Simon Wardley’s…

Core Elixir: IO.puts

Stand back. We’re going to wrestle some dangerous concepts to the ground, though no processes will actually appear…

The string you’re IO.puts-ing can be interpolated. That’s computer science speak for “the language will replace variables with their values inside the quotation marks.”

Like Ruby, Elixir uses #{} for this:

iex> str = 'world'

iex> IO.puts("Hello, #{str}")   # Double quoted strings
Hello, world
:ok

iex> IO.puts('Hello, #{str}')   # Single quoted strings
Hello, world
:ok

In Elixir, the interpolation will happen whether you use single or double quotes. You’ll see I tried it both ways there. The difference in quotation styles is used only in dealing with Elixir strings versus Erlang strings, and we’ll get to that at some point in the nebulous future once the Advil kicks in from getting that terminology right.

You’ll notice that Elixir threw in a new line at the end of the string for us. That’s part of the service IO.puts offers. In other languages, there’s a different command for when you want to use or not use the newline. In Perl, there’s print and say. In Ruby there’s puts and print. In both cases, the print leaves off the newline.

In Elixir, IO.write prints the string as is, without adding the newline.

This can look awkward in the iex REPL:

iex> IO.write('Hello, #{str}')
Hello, world:ok

Both IO.puts and IO.write allow you to send your output to different locations, such as standard error. We’ll cover that more in a future article on IO.puts that’ll blow your mind! (When it comes to managing expectations, I’m a failure.)
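As a small taste of that, here is a sketch of the optional first argument. Both functions accept a device ahead of the text; :stdio is the default and :stderr is standard error. The message strings are placeholders.

```elixir
# Send a line to standard error instead of standard output.
IO.puts(:stderr, "This line goes to standard error")

# IO.write takes a device too, and still leaves the newline up to you.
IO.write(:stdio, "No newline after this")
IO.write(:stdio, "\n")
```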

But, you may ask yourself if you’re bored one day, how does interpolation work?

How Does Interpolation Work?

I’m glad you asked. It’s sorta like polymorphism. Yes, welcome to that rabbit hole…

The first trick is that Elixir sees #{} and rewrites it internally as a call to to_string, then concatenates the result with its surroundings.

"Hello, #{str}" becomes "Hello, " <> to_string(str).

to_string is in the Kernel module (and is thus automatically imported) and dispatches to the String.Chars protocol.

Any module that follows that protocol must implement a to_string function. Both strings and integers do. More complicated types like tuples do not.
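You can check that rewrite by hand. This is a simplified sketch of the idea for double-quoted strings, not a dump of what the compiler actually generates; the variable names are illustrative.

```elixir
str = "world"

# What you write.
interpolated = "Hello, #{str}"

# Roughly what it amounts to.
concatenated = "Hello, " <> to_string(str)

IO.inspect(interpolated == concatenated)

# Integers implement the protocol as well.
IO.inspect("Count: " <> to_string(42))
```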

A list will work:

iex> sample_list =['a','b','c','d']
['a', 'b', 'c', 'd']

iex> IO.puts('Hello, #{sample_list}')
Hello, abcd
:ok

A tuple will not:

iex> t = {1, 'a'}
{1, 'a'}

iex> IO.puts('Hello, #{t}')
** (Protocol.UndefinedError) protocol String.Chars not implemented for {1, 'a'}
    (elixir) lib/string/chars.ex:3: String.Chars.impl_for!/1
    (elixir) lib/string/chars.ex:17: String.Chars.to_string/1

In this case, you can work around things by specifying inspect instead of relying on Elixir to use to_string:

iex> IO.puts('Hello, #{inspect t}')
Hello, {1, 'a'}
:ok

Inspect This

That looks a little magical, until you dig a little further and realize how it works:

iex> IO.puts('Hello, #{Kernel.inspect(t)}')
Hello, {1, 'a'}
:ok

inspect, like to_string, is not a magic keyword. It’s a function in the Kernel module. The Kernel module is automatically loaded every time you run Elixir.

If you’re not far enough down the rabbit hole yet, look closer at Kernel.inspect/2. Yes, that’s right: an arity of 2. That second parameter is a list of options that become an Inspect.Opts struct, which is part of the Inspect protocol.

And, yes, we’re going there next.

Inspect.Opts gives you some interesting choices. You can limit the number of values returned, for example:

iex> IO.puts('Hello, #{Kernel.inspect(t, [limit: 1])}')
Hello, {1, ...}
:ok

More drastically:

iex> listicle = ['a', 'b', 'c', 'd', 'e', 'f']
['a', 'b', 'c', 'd', 'e', 'f']
iex> IO.puts('Hello, #{Kernel.inspect(listicle, [limit: 1])}')
Hello, ['a', ...]
:ok

It seems kind of weird to me that the absence of the rest of the values is shown with an ellipsis, but I’m sure that comes in handy somewhere.

There are also options like :pretty, which enables pretty printing, and :width that limits the number of characters in a string line when pretty printing is invoked. For example:

iex> IO.puts('Hello, #{Kernel.inspect(listicle, [limit: 3, width: 3])}')
Hello, ['a', 'b', 'c', ...]
:ok
iex> IO.puts('Hello, #{Kernel.inspect(listicle, [pretty: true, limit: 3, width: 3])}')
Hello, ['a',
 'b',
 'c',
 ...]
:ok

You can, of course, see all the parameters in the Inspect.Opts documentation.

OK, let’s work our way back up to the polymorphism I promised earlier.

Polymorphic Protocol

String.Chars is a protocol. That is, it defines a set of functions that a data type must implement if it wants to claim it conforms to the protocol.

In String.Chars’ case, there’s only one function to implement: to_string/1. For whatever structure you’re using, the way to be sure it can be represented on screen or in a log is to have it hew to this protocol, which is just a convention that has to be implemented separately for each type.

So, multiple types will have a to_string implementation, and you call it the same way everywhere because they conform to the protocol.

With String.Chars, you can print the values of certain variable types out to a string, such as in the example we started this whole essay with. IO.puts can only print the value of a variable out to the screen as long as the variable is of a type that can be converted to a string, either through the String.Chars protocol or Kernel.inspect. The #{} is just shorthand for all of that.
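To see the protocol from the implementing side, here is a minimal sketch. PointSketch is a hypothetical struct invented for this example; defimpl and String.Chars are the real pieces. Once the implementation exists, interpolation and IO.puts accept the struct.

```elixir
defmodule PointSketch do
  defstruct x: 0, y: 0
end

# Teach the protocol how to turn this struct into a string.
defimpl String.Chars, for: PointSketch do
  def to_string(point), do: "(#{point.x}, #{point.y})"
end

point = %PointSketch{x: 1, y: 2}
IO.puts("Point: #{point}")
```

Without the defimpl block, that IO.puts call would raise the same Protocol.UndefinedError the tuple example did.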

To Sum It Up

IO.puts prints stuff to the screen, and you can include variables in that.

When you interpolate a value into a string, you’re using a couple of Kernel functions, adhering to a protocol, and being all polymorphic.

Thankfully, you didn’t go with the <> operator, instead, because then we could travel down a fun road of macro programming that would take another thousand words to get through.

Completely Unrelated

In my research for this Core Elixir installment, I came across a website with “Hello World” examples in dozens of languages.

The ‘winner’ of the page is a tie between the Whitespace language and the Obfuscated Perl example. Nobody comes close to either of those.

Really, that Perl example will blow your mind.

Scheduling Notice

I’m off to OSCon next week, so Core Elixir will be taking a week off. It’ll be back the following week. Thanks for not weeping openly at this news.

If you have any comments, questions, complaints, criticisms, or corrections, catch me on Twitter, @AugieDB. That handle is the same as my GMail account, if you need to type more characters. I want these articles to be factually correct and will update them as necessary.


Core Elixir: So You Say You Want to Copy a File

To quote Perl, “There’s More Than One Way To Do It.”

In fact, I’m counting six. They come in pairs, though, thanks to our old friend the exclamation point. Let’s pair them off:

File.copy and File.copy!

File.copy has the same exact name and argument order as the Erlang function that does the same thing. Here, look at the source code:

    def copy(source, destination, bytes_count \\ :infinity) do
      F.copy(IO.chardata_to_string(source), IO.chardata_to_string(destination), bytes_count)
    end

(“F” is an alias for the Erlang :file module.)

It literally just converts the strings to the format Erlang prefers and runs the same function.

But what’s with that bytes_count bit at the end? I’ll tell you, but let’s first look at the return values of a successful call to File.copy and File.copy!:

  iex> File.copy('_config.yml', 'testcopyfile_deletemenow.txt')
  {:ok, 3103}

  iex> File.copy!('_config.yml', 'testcopyfile_deletemenow.txt')
  3103

That number that comes back represents the number of bytes that were copied. It is, effectively, the file size.

If you want to pattern match on the results of the copy, then you’ll want to go with File.copy, without the exclamation point at the end.

File.copy has an arity of 3. Yes, there’s a third argument. If you don’t include it, it defaults to :infinity. Otherwise, it acts as a limiter and will only copy over that many bytes of the file to the new one. This is part of the Erlang :file.copy function.
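Here is a sketch of that limiter in action. The file names are throwaway temp files invented for the example, and the ten-character content is arbitrary.

```elixir
source = Path.join(System.tmp_dir!(), "copy_sketch_source.txt")
destination = Path.join(System.tmp_dir!(), "copy_sketch_destination.txt")

File.write!(source, "0123456789")

# Copy only the first 4 bytes of the source file.
{:ok, bytes_copied} = File.copy(source, destination, 4)
partial = File.read!(destination)
IO.inspect({bytes_copied, partial})

# Clean up the temp files.
File.rm!(source)
File.rm!(destination)
```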

Note: This version of file copying assumes you don’t mind overwriting a file that already exists with the destination’s name. It will do so without warning.

File.cp and File.cp!

I said previously that one of the biggest helps the File module provides in wrapping the Erlang :file module is that it translates Erlang-speak to Unix speak. If you want to copy a file, you can use the cp command with Elixir:

    iex> File.cp('example.txt', 'copy_of_file.txt')
    :ok

And, assuming there are no errors, the cp! function will return the same thing:

    iex> File.cp!('example.txt', 'copy_of_file.txt')
    :ok

As always with the bang, the error message is more human readable with it than without it, unless you’re fluent in POSIX:

    iex> File.cp('file_does_not_exist.txt', 'etc.txt')
    {:error, :enoent}

    iex> File.cp!('file_does_not_exist.txt', 'etc.txt')
    ** (File.CopyError) could not copy recursively from file_does_not_exist.txt to etc.txt: no such file or directory
      (elixir) lib/file.ex:439: File.cp!/3
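If you prefer the version without the bang but still want a readable message, one option is to pattern match on the result and translate the POSIX atom yourself. This is a sketch: CopySketch and the temp file names are hypothetical, while :file.format_error/1 is the Erlang function that turns an atom like :enoent into text.

```elixir
defmodule CopySketch do
  # Returns a human-readable description of what happened.
  def describe_copy(source, destination) do
    case File.cp(source, destination) do
      :ok -> "copied #{source} to #{destination}"
      {:error, reason} -> "copy failed: #{:file.format_error(reason)}"
    end
  end
end

missing = Path.join(System.tmp_dir!(), "cp_sketch_file_that_does_not_exist.txt")
target = Path.join(System.tmp_dir!(), "cp_sketch_target.txt")

IO.puts(CopySketch.describe_copy(missing, target))
```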

File.cp_r and File.cp_r!

Things get complicated here.

This is the recursive copy command. If you provide it with a directory as the source, it’ll copy everything in that directory to the directory destination you give it.

It returns the list of files it copied over, which can be extremely handy.

If you give it a single file name, it’ll just copy that single file.

One important note for error checking’s sake here: If the copy fails in the middle, it fails “dirty”. It’ll leave what it had already copied in the destination directory. It’s up to you to clean it up.

On success, it’ll return a list of files and directories it has copied. This includes the directory or directories you’re copying into.

iex> File.cp_r('tmp1', 'tmp2')
{:ok, ["tmp2/test3.md", "tmp2/test2.md", "tmp2/test1.md", "tmp2"]}

iex> File.cp_r!('tmp1', 'tmp2')
["tmp2/test3.md", "tmp2/test2.md", "tmp2/test1.md", "tmp2"]

I ran those commands back to back without clearing the second directory. By default, it will just overwrite the files.

You can, however, specify what to do if a file is already found in the destination directory. That’s right, it’s time for another case of The Function with an Extra Arity! File.cp_r is /3. The third parameter is a callback function to handle the source and destination file. If that function returns true, the file will be copied. If false, it won’t be.

Here’s how the function begins in the source code:

 def cp_r(source, destination, callback \\ fn(_, _) -> true end) when is_function(callback) do

Things to notice here:

  • There’s a guard clause on there to check that the third parameter is a function.
  • If no callback is given, there’s a default function (to match the is_function guard) that gives a default true answer.

So if you want to default to a false answer, then pass a false function. Make something fun up, like

  fn(_, _) -> false end

No, you can do better than that:

  fn(_, _) -> 1 &&& 0 end

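Impress your friends! Confuse yourself six months from now with a misplaced sense of humor. Seriously, don’t check that in. You’d be begging for a pull request. (It doesn’t even work as a joke: &&& needs the Bitwise module, and the 0 it returns counts as true in Elixir, where only false and nil are falsy.)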

I’ll grab the example given in the documentation and change a couple of things to keep it consistent with what we’ve done so far:

iex(1)> File.cp_r "tmp1", "tmp2", fn(source, destination) ->
...(1)> IO.gets("Overwriting #{destination} by #{source}. Type y to
...(1)> confirm.") == "y"
...(1)> end
Overwriting tmp2/test1.md by tmp1/test1.md. Type y to
confirm.y
Overwriting tmp2/test2.md by tmp1/test2.md. Type y to
confirm.y
Overwriting tmp2/test3.md by tmp1/test3.md. Type y to
confirm.y
{:ok, ["tmp2"]}

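It appears that when you fall back to the callback, you lose the list of files you’re copying. Don’t trust that conclusion too far, though: IO.gets returns the typed line with its trailing newline ("y\n"), so the comparison against "y" above is always false and the callback declines every overwrite. Compare against "y\n", or trim the input first, before relying on this.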

More fun: Dig down deep enough in the code and you’ll trip over our old friend, Enum.reduce. Wait, Enum.reduce isn’t your friend yet. I’m writing these Core Elixir posts out of order. We’ll get there; you’ll see. Just remember that name in the meantime…

Out of Context, This Looks Cray-Cray

 defp do_cp_r(_, _, _, acc) do
    acc
  end

Is that Assembly Language? Elixir Morse Code? Elixir Mad Libs?

No, it’s the default pattern match at the end of a list of matches, but its terseness out of context makes it look like utter gibberish. Remember, kids, to always read code in context! (“Code in Context” would make a great PragProg title… Someone else has “Code Complete,” so…)

If you have any comments, questions, complaints, criticisms, or corrections, catch me on Twitter, @AugieDB. That handle is the same as my GMail account, if you need to type more characters. I want these articles to be factually correct and will update them as necessary.


Core Elixir: Home on the Range Module

It seems so small. Two functions and you’re done.

Then you open up the source and see the protocols at work. That’s when you realize that the thing you think of as a range is really a little more complicated than that. A range is not a core Elixir data type. It’s actually just a Range struct:

def new(first, last) do
  %Range{first: first, last: last}
end

It’s a simple and specific struct with all of two values, as you might suspect: the first one and the last one. You get the two dots in the middle for free.

The module also has a test to see if it’s a range:

def range?( %Range{} ), do: true
def range?(_), do: false

It’s almost too obvious, isn’t it? If you pass in an item that pattern matches a Range struct, then you’re a range. Otherwise, you’re not.
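The same pattern-matching trick works in your own code. RangeCheckSketch is a hypothetical module that mirrors the two clauses above, and it also shows that the first and last values are ordinary struct fields.

```elixir
defmodule RangeCheckSketch do
  # Matches anything built from the Range struct.
  def range?(%Range{}), do: true
  # Everything else falls through to here.
  def range?(_), do: false
end

r = 1..3
IO.inspect(RangeCheckSketch.range?(r))
IO.inspect(RangeCheckSketch.range?([1, 2, 3]))

# The endpoints are plain fields on the struct.
IO.inspect({r.first, r.last})
```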

Naming wise, it’s a bit different than you might expect. With other data types, Elixir comes preloaded with is_ functions. Does there exist an is_range function? Let’s try it:

iex> r = 1..3
1..3
iex> is_range?(r)
** (RuntimeError) undefined function: is_range?/1

Nope. Let’s take a look. From an iex prompt, type in is_ and hit tab for the auto-complete options:

iex(6)> is_
is_atom/1         is_binary/1       is_bitstring/1    is_boolean/1
is_float/1        is_function/1     is_function/2     is_integer/1
is_list/1         is_map/1          is_nil/1          is_number/1
is_pid/1          is_port/1         is_reference/1    is_tuple/1

Range isn’t on that list. Nor, you may notice, is there an is_struct. You need the range? function mentioned above instead:

iex> Range.range?(r)
true

I have to admit that that does look a little weird to me. Could look stranger, though:

iex> range = 1..3
iex> Range.range?(range)
true

This tutorial is showing great range, isn’t it?

The Protocols. Oh, the Protocols!

There are three protocols that Range maps to, and that’s what takes up the second half of the module.

First, the type is enumerable, so it has to define its reduce function to be compatible with everything in the Enum library. When we get to the Enum library in a future “Core Elixir” installment, you’ll see that almost all of the Enum functions are using reduce.

It also implements member? to tell if a value comes inside the range.

def member?(first .. last, value) do
  if first <= last do
    {:ok, first <= value and value <= last}
  else
    {:ok, last <= value and value <= first}
  end
end

That’s fairly simple, too, though Perl6 has it beat:

$low < $value < $high

I’m sure it has to do with the way languages are parsed and Abstract Syntax Trees and all the rest, but why doesn’t every language these days have that? Why do we need a special function or two tests to prove that out?

My kingdom for a macro!

Count Along with Range

Finally, the Range module has a count function, which you can likely guess the purpose of.

Here’s the funny trick there: The count function uses a count function that’s defined as part of the Range.Iterator protocol! Yes, it’s protocols all the way down!

Range.Iterator has count and next defined, for what that might be worth. I won’t bore you too much with those because I bet you’re smart enough to figure out how they work.

Last, you need to have some way of inspecting the range so, of course, you implement the Inspect protocol!

defimpl Inspect, for: Range do
  import Inspect.Algebra

  def inspect(first .. last, opts) do
    concat [ to_doc(first, opts), "..", to_doc(last, opts) ]
  end
end

Some background: to_doc is in the Inspect.Algebra module. I never would have guessed that, but thankfully the Elixir documentation has great search functionality.

The purpose of to_doc is to convert an Elixir structure to an “algebra document.” In other words, it translates a number to a string so it can be printed out. (This is a gross oversimplification, but it’s good enough, I think.)

Long story short: inspect makes sense, converting the struct to the first number, dot, dot, the last number.

iex> inspect(r)
"1..3"
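The same recipe should work for a struct of your own. Here's a sketch under that assumption: Span is a hypothetical struct I made up, and the "~" separator is arbitrary. Only defimpl, concat, and to_doc come from Elixir.

```elixir
# Hypothetical struct, standing in for Range.
defmodule Span do
  defstruct [:from, :to]
end

defimpl Inspect, for: Span do
  import Inspect.Algebra

  # Build the output from three pieces: first value, separator, last value.
  def inspect(%Span{from: from, to: to}, opts) do
    concat([to_doc(from, opts), "~", to_doc(to, opts)])
  end
end

IO.puts(inspect(%Span{from: 1, to: 3}))  # prints 1~3
```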

Hey, nothing in a programming language appears by magic, even something as simple as showing the value of a variable. It all needs to be coded in somewhere.

Your Range Power Tip of the Day

You can have descending ranges! 10..1 is just as valid as 1..10, though obviously your outcome will be wildly different for many functions.

Look through the Range code enough and you’ll find places where that has to be kept in mind as the program runs. The simplest example I can give you is the Range.member?/2 function which, as you might imagine, checks a range to see if a specific value falls within it:

def member?(first .. last, value) do
  if first <= last do
    {:ok, first <= value and value <= last}
  else
    {:ok, last <= value and value <= first}
  end
end
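A short sketch of a descending range in action, with values of my own choosing. The membership test gives the same answers as it would for 1..10, while the order of the values flips:

```elixir
# Converting to a list shows the values counting down.
IO.inspect(Enum.to_list(3..1))       # => [3, 2, 1]

# Membership ignores direction: 5 sits between 10 and 1 either way.
IO.inspect(Enum.member?(10..1, 5))   # => true
IO.inspect(Enum.member?(10..1, 11))  # => false
```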

If you have any comments, questions, complaints, criticisms, or corrections, catch me on Twitter, @AugieDB. That handle is the same as my GMail account, if you need to type more characters. I want these articles to be factually correct and will update them as necessary.


Core Elixir: Integer.is_even/1 and Integer.is_odd/1

Is this number odd or even?

This should be simple, right? Oldest trick in the book: If the modulo (remainder) of the number divided by 2 is 0 (has no remainder), then it’s even. Otherwise, it’s odd.

In fact, here, I’ll give you an example with the Elixir rem function, which returns the remainder after dividing the first argument by the second:

iex> rem(10, 2)
0
iex> rem(11,2)
1

The first one is even. The second one is odd. Done and done. Still simple, right?
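Wrapped up as functions, the remainder approach might look like the sketch below. RemParity is a hypothetical module of mine, not anything in Elixir. Note that the odd check compares against "not zero" rather than 1, because rem keeps the sign of the number being divided.

```elixir
# Hypothetical module showing the remainder-based approach.
defmodule RemParity do
  # Even means dividing by 2 leaves nothing over.
  def even?(n) when is_integer(n), do: rem(n, 2) == 0

  # rem(-3, 2) is -1, so test for "not zero" instead of "equals 1".
  def odd?(n) when is_integer(n), do: rem(n, 2) != 0
end

IO.inspect(RemParity.even?(10))  # => true
IO.inspect(RemParity.odd?(11))   # => true
IO.inspect(RemParity.odd?(-3))   # => true
```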

Elixir’s rem is actually Kernel.rem, which calls out to Erlang’s :erlang.rem to do the actual work.

But that’s not how Integer.is_even/1 and Integer.is_odd/1 do it. Their approach is much fancier, and much lower-level.

Back to Bits

Elixir reaches into its back pocket and pulls out a macro for this problem.

I’m not going to get into what a macro is or how it works here. Chris McCord wrote a book about it. Since I haven’t read it yet, it would be irresponsible of me to attempt to break that down any further.

But the trick it uses to determine whether an integer is odd or even is the Bitwise AND operator. I’ll attempt to provide a lesson on that today, instead. It’s likely equally irresponsible of me to attempt this, but I have a Computer Science degree, so we should be good, right?

B to the AND

The Bitwise AND operator treats numbers as bits, i.e. as strings of 1s and 0s, and compares them position by position. This is ridiculously handy for determining when something is even or odd. Why? That last bit will give it away. If the low bit (the one on the far right side) is turned off (0), the number is even. If it’s turned on (1), the number is odd.

Remember that all the other bits before that last one stand for even numbers (2, 4, 8, 16, etc.). Only that last one, 1, is odd. And if there’s anything you might remember from second grade math, it’s that even + even is even and even + odd is odd. (Odd + odd is also even, but we don’t have two odd bits to play with.)

So you AND the number with 1 (that is, “00000001”). 0 AND anything gives you 0, so you effectively zero out all the bits except for maybe that last one.

It comes down to that last bit: 1 AND 1 gives you 1. 0 AND 1 gives you 0. A final value of 1 is odd, 0 is even.
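To see the bits for yourself, here's a small sketch. The numbers 19 and 20 are my own picks, and I'm assuming an import of Bitwise to bring &&& into scope. Integer.to_string/2 prints a number in base 2.

```elixir
import Bitwise

# 19 is 16 + 2 + 1, so its low bit is on.
IO.puts(Integer.to_string(19, 2))  # prints 10011

# 20 is 16 + 4, so its low bit is off.
IO.puts(Integer.to_string(20, 2))  # prints 10100

# ANDing with 1 keeps only that low bit.
IO.inspect(19 &&& 1)  # => 1
IO.inspect(20 &&& 1)  # => 0
```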

In other words, is_odd returns true when that computation results in 1, and is_even returns true when that computation results in 0. That’s exactly what the source code does:

defmacro is_odd(n) do
  quote do: (unquote(n) &&& 1) == 1
end

defmacro is_even(n) do
  quote do: (unquote(n) &&& 1) == 0
end

(Even not knowing much about macros, that code is pretty readable.)
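One payoff of these being macros built from a bitwise operator and a comparison is that they can sit in guard clauses. A minimal sketch, using a hypothetical ParityGuard module of my own:

```elixir
# Hypothetical module; only Integer.is_even/1 and Integer.is_odd/1 are Elixir's.
defmodule ParityGuard do
  # The macros must be required before they can be called.
  require Integer

  def label(n) when Integer.is_even(n), do: :even
  def label(n) when Integer.is_odd(n), do: :odd
end

IO.inspect(ParityGuard.label(20))  # => :even
IO.inspect(ParityGuard.label(19))  # => :odd
```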

One last example of how Bitwise AND works:
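Here's a sketch with two numbers of my own choosing, 12 and 10, written as binary literals so the columns line up. A bit survives only where both numbers have a 1 in the same position. (I'm assuming an import of Bitwise to get the operator.)

```elixir
import Bitwise

#     1100   (12)
# AND 1010   (10)
#   = 1000   (8): only the leftmost column has a 1 in both rows
IO.inspect(0b1100 &&& 0b1010)  # => 8

# The odd/even check is the same operation with 1 as the second number.
IO.inspect(0b1100 &&& 0b0001)  # => 0, so 12 is even
```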

Now you can feel closer to the metal and like a smarter, more powerful programmer. You just played with the bits!

The Erlang Connection

One last thing I should mention: The &&& operator runs Erlang’s :band function (Bitwise AND, get it?). See the source here.

‘&&&’ is included in Elixir’s Bitwise module, which must be required before you can use it in your modules, as it is in Elixir’s Integer library. Otherwise, you get an error that tells you to do just that:

iex> Bitwise.&&&(20, 1)
** (CompileError) iex:22: you must require Bitwise before invoking the macro Bitwise.&&&/2
    (elixir) src/elixir_dispatch.erl:97: :elixir_dispatch.dispatch_require/6

I only point this out because the error message is so helpful. As I delve deeper into the Elixir core, I see more and more of the most helpful and specific error messages I’ve ever seen in a computer language. That’s handy.

So, use the library and compute away:

iex> use Bitwise
iex> Bitwise.&&&(20, 1)  # 20 is even, so should return 0
0

Note that the return value is the same as if you had done the old divide-by-two trick. No remainder indicates it’s even. And if there is a 1 leftover:

iex> Bitwise.&&&(19, 1) # 19 is odd, so should return 1
1

Boom! Odd.
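If you want more than two data points, here's a sketch that checks the two approaches against each other over a stretch of integers. The range is an arbitrary choice of mine, and I'm using import rather than use to bring in the operator.

```elixir
import Bitwise

# For every integer in the range, "remainder is zero" and "low bit is zero"
# should give the same verdict on evenness.
agree =
  Enum.all?(-1000..1000, fn n ->
    (rem(n, 2) == 0) == ((n &&& 1) == 0)
  end)

IO.inspect(agree)  # => true
```

The comparison is on the even/odd verdict rather than the raw values, since rem returns -1 for a negative odd number while the AND returns 1.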

Including and Requiring

I mentioned before that is_odd and is_even are macros, right? That means that, unlike with the Bitwise module, we can’t just use the Integer module. We need to require or import it. That will ensure that the macros are created and ready to go at compilation time. (The Elixir Getting Started Guide explains this fully, and even uses Integer as its example.)

What’s the difference between the two? With require, you still need to include the module name in your commands:

iex> require Integer
iex> Integer.is_even(20)
true

With import, you don’t need to use the module name anymore:

iex> import Integer
iex> is_odd(20)
false

You also have the ability to import only the specific functions you need, or to import only the macros in the module.

iex> import Integer, only: :macros
iex> is_even(20)
true
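For the "specific functions" flavor, you hand only: a list of names and arities. A minimal sketch inside a hypothetical Halver module of mine:

```elixir
# Hypothetical module; the import line is the part being illustrated.
defmodule Halver do
  # Bring in is_even/1 and nothing else from Integer.
  import Integer, only: [is_even: 1]

  # The imported macro can be used unqualified, even in a guard.
  def halve(n) when is_even(n), do: {:ok, div(n, 2)}
  def halve(_n), do: :error
end

IO.inspect(Halver.halve(20))  # => {:ok, 10}
IO.inspect(Halver.halve(7))   # => :error
```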

And Bob’s your uncle.
