Strings

Strings in Elixir are written between double quotes and are encoded in UTF-8. Unlike C and C++, where the default strings are sequences of single-byte characters (ASCII defines only 128 characters, and a single byte allows at most 256 values), UTF-8 can encode all 1,112,064 valid Unicode code points. Since strings use UTF-8, we can also use characters like ö, ł, etc.

Create a String

To create a string variable, simply assign a string to a variable −

str = "Hello world"

To print this to your console, simply call the IO.puts function and pass it the variable str −

str = "Hello world"
IO.puts(str)

The above program generates the following result −

Hello world

Empty Strings

You can create an empty string using the string literal "". For example,

a = ""
if String.length(a) === 0 do
   IO.puts("a is an empty string")
end

The above program generates the following result.

a is an empty string
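
As a small sketch of other ways to test for emptiness (the variable names here are only illustrative), a string can be compared directly with "" or checked with byte_size, and String.trim can be used to treat whitespace-only strings as empty.

```elixir
a = ""
is_empty = a == ""
IO.puts(is_empty)             # prints: true
IO.puts(byte_size(a) == 0)    # prints: true

# A whitespace-only string is not empty until it is trimmed
blank = "   "
is_blank = String.trim(blank) == ""
IO.puts(blank == "")          # prints: false
IO.puts(is_blank)             # prints: true
```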

String Interpolation

String interpolation is a way to construct a new string value from a mix of constants, variables, literals, and expressions by including their values inside a string literal. Elixir supports string interpolation: to use a variable or expression in a string, wrap it in curly braces and prefix the braces with a '#' sign, as in #{x}.

For example,

x = "Apocalypse" 
y = "X-men #{x}"
IO.puts(y)

This will take the value of x and substitute it in y. The above code will generate the following result −

X-men Apocalypse
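
Interpolation is not limited to string variables. The following sketch, with made-up variable names, shows that any expression can appear inside #{} and that non-string values such as integers are converted to strings automatically.

```elixir
name = "Elixir"
count = 3

# An arithmetic expression is evaluated, then converted to a string
sum = "1 + 2 = #{1 + 2}"

# Function calls and integers can be interpolated as well
greeting = "Hello, #{String.upcase(name)}! You have #{count} new messages."

IO.puts(sum)       # prints: 1 + 2 = 3
IO.puts(greeting)  # prints: Hello, ELIXIR! You have 3 new messages.
```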

String Concatenation

We have already seen the use of String concatenation in previous chapters. The ‘<>’ operator is used to concatenate strings in Elixir. To concatenate 2 strings,

x = "Dark"
y = "Knight"
z = x <> " " <> y
IO.puts(z)

The above code generates the following result −

Dark Knight
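
Unlike interpolation, the <> operator only accepts strings, so other values have to be converted first. The sketch below (the variable names are illustrative) shows this, and also shows that <> can be used in a pattern to match a known prefix.

```elixir
age = 30

# age is an integer, so it must be converted before it can be concatenated
line = "Age: " <> Integer.to_string(age)
IO.puts(line)  # prints: Age: 30

# <> can also appear on the left side of a match to strip a known prefix
"Dark " <> rest = "Dark Knight"
IO.puts(rest)  # prints: Knight
```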

String Length

To get the length of a string, we use the String.length function. Pass the string as an argument and it returns the number of characters (Unicode graphemes) in it, which can differ from its size in bytes. For example,

IO.puts(String.length("Hello"))

The above program generates the following result −

5
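
Because strings are UTF-8 encoded, the number of characters and the number of bytes are not always the same. A minimal sketch comparing String.length with the byte_size function on a string that contains the 2-byte character ł:

```elixir
word = "hełło"

chars = String.length(word)  # counts graphemes
bytes = byte_size(word)      # counts bytes in the underlying binary

IO.puts(chars)  # prints: 5
IO.puts(bytes)  # prints: 7, since each ł takes 2 bytes
```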

Reversing a String

To reverse a string, pass it to the String.reverse function. For example,

IO.puts(String.reverse("Elixir"))

The above program generates the following result −

rixilE
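
String.reverse works on characters rather than bytes, so multi-byte characters stay intact. The sketch below illustrates this and uses it for a simple palindrome check; is_palindrome is a hypothetical helper defined only for this example.

```elixir
reversed = String.reverse("hełło")
IO.puts(reversed)  # prints: ołłeh

# Hypothetical helper: a string is a palindrome if it equals its reverse
is_palindrome = fn s -> s == String.reverse(s) end

IO.puts(is_palindrome.("level"))   # prints: true
IO.puts(is_palindrome.("Elixir"))  # prints: false
```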

String Comparison

To compare 2 strings, we can use the == or the === operators. For example,

var_1 = "Hello world"
var_2 = "Hello Elixir"
if var_1 === var_2 do
   IO.puts("#{var_1} and #{var_2} are the same")
else
   IO.puts("#{var_1} and #{var_2} are not the same")
end

The above program generates the following result −

Hello world and Hello Elixir are not the same
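
For strings, == and === give the same answer; the two operators only differ when comparing integers with floats. Comparison is also case-sensitive, so one common approach to a case-insensitive check is to normalize both sides first. A small sketch with illustrative variable names:

```elixir
a = "Hello World"
b = "hello world"

same_exact = a == b
same_ignoring_case = String.downcase(a) == String.downcase(b)

IO.puts(same_exact)          # prints: false
IO.puts(same_ignoring_case)  # prints: true

# == and === differ only for numbers of different types
IO.puts(1 == 1.0)   # prints: true
IO.puts(1 === 1.0)  # prints: false
```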

String Matching

We have already seen the use of the =~ string match operator. To check if a string matches a regex, we can use either this operator or the String.match? function. For example,

IO.puts(String.match?("foo", ~r/foo/))
IO.puts(String.match?("bar", ~r/foo/))

The above program generates the following result −

true 
false

The same can also be achieved by using the =~ operator. For example,

IO.puts("foo" =~ ~r/foo/)

The above program generates the following result −

true
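
The =~ operator also accepts a plain string on its right-hand side, in which case it checks whether the left-hand string contains it. A short sketch of both forms (the variable names are illustrative):

```elixir
text = "hello world"

contains_world = text =~ "world"          # substring check
starts_with_world = text =~ ~r/^world/    # regex anchored at the start
has_gap = String.match?(text, ~r/o\s+w/)  # "o", whitespace, then "w"

IO.puts(contains_world)     # prints: true
IO.puts(starts_with_world)  # prints: false
IO.puts(has_gap)            # prints: true
```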

String Functions

Elixir provides a large number of string functions in the String module. Some of the most used ones are listed in the following table.

Sr.No. | Function | Purpose
1 | at(string, position) | Returns the grapheme at the given position of the UTF-8 string. If the position is out of range, it returns nil
2 | capitalize(string) | Converts the first character in the given string to uppercase and the remainder to lowercase
3 | contains?(string, contents) | Checks if string contains any of the given contents
4 | downcase(string) | Converts all characters in the given string to lowercase
5 | ends_with?(string, suffixes) | Returns true if string ends with any of the suffixes given
6 | first(string) | Returns the first grapheme from a UTF-8 string, nil if the string is empty
7 | last(string) | Returns the last grapheme from a UTF-8 string, nil if the string is empty
8 | replace(subject, pattern, replacement, options \\ []) | Returns a new string created by replacing occurrences of pattern in subject with replacement
9 | slice(string, start, len) | Returns a substring starting at the offset start, and of length len
10 | split(string) | Divides a string into substrings at each Unicode whitespace occurrence with leading and trailing whitespace ignored. Groups of whitespace are treated as a single occurrence. Divisions do not occur on non-breaking whitespace
11 | upcase(string) | Converts all characters in the given string to uppercase
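
A few of these functions in action, as a quick sketch. The input strings are made up for illustration, and IO.inspect is used so that the quotes, nil, and lists are visible in the output.

```elixir
first_char = String.at("elixir", 0)
missing = String.at("elixir", 10)
capitalized = String.capitalize("hELLO")
replaced = String.replace("a,b,c", ",", "-")
sliced = String.slice("elixir", 1, 3)
words = String.split("  foo   bar  ")

IO.inspect(first_char)   # prints: "e"
IO.inspect(missing)      # prints: nil
IO.inspect(capitalized)  # prints: "Hello"
IO.inspect(replaced)     # prints: "a-b-c"
IO.inspect(sliced)       # prints: "lix"
IO.inspect(words)        # prints: ["foo", "bar"]

# contains? and ends_with? accept a single pattern or a list of patterns
IO.inspect(String.contains?("elixir", ["x", "z"]))             # prints: true
IO.inspect(String.ends_with?("image.png", [".png", ".jpg"]))   # prints: true
```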

Binaries

A binary is just a sequence of bytes. Binaries are defined using << >>. For example:

<< 0, 1, 2, 3 >>

Of course, those bytes can be organized in any way, even in a sequence that does not make them a valid string. For example,

<< 239, 191, 19 >>

Strings are also binaries, and the string concatenation operator <> is actually a binary concatenation operator −

IO.inspect(<< 0, 1 >> <> << 2, 3 >>)

The above code generates the following result −

<<0, 1, 2, 3>>

Because strings are UTF-8 encoded, a character such as ł takes up 2 bytes in the underlying binary.
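
To see the bytes behind a string, one option is to inspect it with the binaries: :as_binaries option, or to convert it to a list of bytes with Erlang's :binary.bin_to_list. The sketch below uses "hełło" as an illustrative input and also checks a byte sequence that is not valid UTF-8 with String.valid?.

```elixir
l_bytes = :binary.bin_to_list("ł")
IO.inspect(l_bytes)  # prints: [197, 130]

# Show the raw bytes of a string instead of its characters
IO.inspect("hełło", binaries: :as_binaries)
# prints: <<104, 101, 197, 130, 197, 130, 111>>

# Not every binary is a valid string
valid_text = String.valid?("hełło")
valid_bytes = String.valid?(<<239, 191, 19>>)
IO.puts(valid_text)   # prints: true
IO.puts(valid_bytes)  # prints: false
```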

Since each number in a binary is meant to represent a byte, a value greater than 255 is truncated to its lowest 8 bits. To prevent this, we use the size modifier to specify how many bits we want that number to take. For example −

IO.inspect(<< 256 >>) # truncated, prints <<0>>
IO.inspect(<< 256 :: size(16) >>) # takes 16 bits/2 bytes, prints <<1, 0>>

The above program will generate the following result −

<<0>>
<<1, 0>>

We can also use the utf8 modifier, which treats the number as a Unicode code point and encodes it in UTF-8, so the corresponding character is produced in the output −

IO.puts(<< 256 :: utf8 >>)

The above program generates the following result −

Ā
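
A short sketch of what the utf8 modifier produces under the hood: the code point 256 is stored as two bytes, which print as the single character Ā. The variable name is illustrative.

```elixir
a_macron = <<256::utf8>>

IO.puts(a_macron)                             # prints: Ā
IO.inspect(a_macron, binaries: :as_binaries)  # prints: <<196, 128>>
IO.puts(byte_size(a_macron))                  # prints: 2
IO.puts(String.length(a_macron))              # prints: 1
```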

We also have a function called is_binary that checks whether a given value is a binary. Note that only values whose size in bits is a multiple of 8 are binaries.

Bitstrings

If we define a binary using the size modifier and the total number of bits is not a multiple of 8, we end up with a bitstring instead of a binary. For example,

bs = << 1 :: size(1) >>
IO.inspect(bs)
IO.puts(is_binary(bs))
IO.puts(is_bitstring(bs))

The above program generates the following result −

<<1::size(1)>>
false
true

This means that variable bs is not a binary but rather a bitstring. We can also say that a binary is a bitstring where the number of bits is divisible by 8. Pattern matching works on binaries as well as bitstrings in the same way.
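
As a minimal sketch of such pattern matching (all variable names are illustrative), the binary and bitstring modifiers can be used to capture "the rest" of a value, and size can be used to pick out individual bit fields.

```elixir
# Take the first byte of a binary and keep the remainder
<<first, rest::binary>> = "abc"
IO.inspect(first)  # prints: 97
IO.inspect(rest)   # prints: "bc"

# Split one byte into two 4-bit halves
<<high::size(4), low::size(4)>> = <<0xAB>>
IO.inspect({high, low})  # prints: {10, 11}

# The same syntax works when the total size is not a multiple of 8
<<flag::size(1), tail::bitstring>> = <<1::size(1), 5::size(3)>>
IO.inspect(flag)  # prints: 1
IO.inspect(tail)  # prints: <<5::size(3)>>
```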
