Introduction to Address Resolution Protocol (ARP)
Why an IP address and port are not enough
ARP is used to convert a logical address (IP) to a physical address (MAC). A MAC address is the hardware address of a machine's network interface, and it is the true destination of a packet on the local network.
The application data has to pass through all the layers from top to bottom before getting sent across the internet.
The transport layer gives the destination port. That's not enough. How do we know which machine to send it to?
The Internet layer provides the destination IP address, but we are not done yet. Let's go one layer down.
The MAC address is required to identify the particular device that should receive the frame. It is resolved by the Address Resolution Protocol (ARP). But there's a catch: ARP can only ask devices on its own local network for a MAC address, because ARP requests are layer 2 broadcasts and routers do not forward them.
At first glance, this seems counter-intuitive.
Does this mean that my laptop will never be able to communicate with Google servers because they're not on the same local network? Not true.
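One way to see why this is not a dead end: the sender only needs the MAC address of the next device on its own network. Here is a minimal sketch of that decision, assuming a made-up home network (192.168.1.0/24 with a gateway at 192.168.1.1). The function name arp_target and all addresses are hypothetical, chosen only for illustration; a real network stack consults its routing table rather than a single subnet check.

```python
import ipaddress


def arp_target(src_ip: str, prefix_len: int, gateway_ip: str, dst_ip: str) -> str:
    """Return the IP address whose MAC the sender has to resolve with ARP.

    If the destination is on the sender's own subnet, ARP asks for the
    destination itself. Otherwise ARP asks for the default gateway, which
    takes over the job of forwarding the packet.
    """
    # strict=False lets us pass a host address and still get its network.
    local_net = ipaddress.ip_network(f"{src_ip}/{prefix_len}", strict=False)
    if ipaddress.ip_address(dst_ip) in local_net:
        return dst_ip
    return gateway_ip


# Hypothetical addresses: a laptop, a printer on the same LAN, a remote server.
same_lan = arp_target("192.168.1.23", 24, "192.168.1.1", "192.168.1.50")
remote = arp_target("192.168.1.23", 24, "192.168.1.1", "142.250.72.14")
print(same_lan, remote)
```

For the remote destination the function returns the gateway's IP, so the laptop never needs to learn the remote server's MAC address at all.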
How does ARP work for remote servers?
- My laptop wants to connect to Google.com servers.
- It sends the request to the default gateway (the router). The request carries the destination port and the destination IP address, obtained by resolving google.com through DNS.
- The router identifies that Google's servers are not on the same network.
- At this point,
  - the destination MAC address points to our default gateway (the router).
  - the destination IP address points to Google.com's servers.
- The router sends the packet to the next hop in the network, because it only knows its neighbors' MAC addresses. Now,
  - the destination MAC address points to the next-hop router's MAC address.
  - the destination IP address still points to google.com's IP.
- The process repeats until the packet reaches Google's local network. At the second-to-last stage,
  - the destination MAC address is the Google server's MAC address.
  - the destination IP address still points to google.com.
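The steps above can be sketched as a tiny simulation: at every hop the destination MAC is rewritten, while the destination IP stays the same from start to finish. This is only an illustration under stated assumptions. The Frame class, the forward function, and every address below are hypothetical, and real routers also change other fields (such as the source MAC and the TTL) that are left out here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    dst_mac: str  # layer 2 destination, rewritten at every hop
    dst_ip: str   # layer 3 destination, unchanged end to end


def forward(frame: Frame, next_hop_macs: list[str]) -> list[Frame]:
    """Return the frame as it looks on each link along the path."""
    seen = []
    for mac in next_hop_macs:
        # Each sender addresses the frame to the MAC of its direct neighbor.
        frame = replace(frame, dst_mac=mac)
        seen.append(frame)
    return seen


# Made-up path: default gateway, one intermediate router, then the server.
path = ["02:00:00:00:00:01", "02:00:00:00:00:02", "02:00:00:00:00:99"]
frames = forward(Frame(dst_mac="", dst_ip="142.250.72.14"), path)
for f in frames:
    print(f.dst_mac, "->", f.dst_ip)
```

Only the last frame in the list carries the server's own MAC address, which matches the final stage described above.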
A little experiment
Run the command arp -a in your terminal. Here's what I get:
This is the list of cached IP-to-MAC address mappings stored on my laptop.
- The first entry is for visiting my router's management console using the en0 interface aka WiFi (run ifconfig -v en0 and check for Type).
- The second entry does the same but uses an ethernet interface, en7.
- The third is the IP assigned to my ethernet connection (using DHCP).
- The fourth and fifth entries have the last possible IP address in their subnets. This indicates they are reserved for broadcast (you can tell from their MAC address as well).
- Finally, the remaining three entries are multicast addresses.
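The broadcast and multicast entries can be told apart from their MAC addresses alone. A rough sketch, assuming colon-separated MAC strings: the broadcast MAC is all ones (ff:ff:ff:ff:ff:ff), and any other address with the lowest bit of its first octet set is a group (multicast) address. The function name classify_mac and the sample addresses are hypothetical, not taken from the output above.

```python
def classify_mac(mac: str) -> str:
    """Classify a colon-separated MAC address as broadcast, multicast, or unicast."""
    # int(..., 16) also accepts octets printed without a leading zero ("1:0:5e:...").
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6 or any(not 0 <= o <= 0xFF for o in octets):
        raise ValueError(f"not a MAC address: {mac!r}")
    if all(o == 0xFF for o in octets):
        return "broadcast"
    # The least significant bit of the first octet marks a group address.
    if octets[0] & 0x01:
        return "multicast"
    return "unicast"


# Made-up sample addresses for illustration.
for mac in ["ff:ff:ff:ff:ff:ff", "1:0:5e:0:0:fb", "a4:83:e7:12:34:56"]:
    print(mac, classify_mac(mac))
```

Running it against the MAC column of your own arp -a output should agree with the reading of the entries given in the list above.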
You might ask why we don't see Google's IP-to-MAC mapping here. Again, this is the ARP cache table for my laptop, which is not directly connected to Google's servers (there are non-zero hops in between). Only the router directly connected to their server will have that mapping.
ARP is very interesting, and there's definitely a lot more to learn here (like how ARP poisoning works).
Next time someone asks "What happens when you enter a URL in your browser and hit enter", I'll have a slightly deeper understanding and a longer answer.