Friday, July 3, 2015

Digital Signature and how it is used on the Internet

A digital signature is, as the name implies, the digital equivalent of a handwritten signature: a distinctive digital pattern that identifies a person or system. Public-key cryptography is used to sign a document, which provides non-repudiation, i.e. proof that the document was sent by its owner and was not tampered with in transit. Let's see this in action:

  • The process starts by generating a key pair: a public key and a private key. The public key is distributed to the world, and the private key is kept secret by the owner of the key pair. The public key can be distributed in various ways, such as at a key signing party or by publishing it on a well-known website.
  • Next, a one-way cryptographic hash function (e.g. MD5, SHA-1, SHA-2) generates the message digest, or hash code, from the original document. Hash functions take an arbitrarily long piece of plaintext and compute a fixed-length string from it. MD5 and SHA-1 are no longer considered collision resistant, so a SHA-2 function is the safer choice for new signatures.
  • The message digest is then encrypted with the private key, and the encrypted hash code is appended to the document. This is the digital signature.
  • The receiver first generates the hash code of the document using the same hash function as the sender.
  • The receiver decrypts the appended hash code using the sender’s public key.
  • The receiver compares the two hash codes. If they match, the document was signed with the owner’s private key and was not tampered with in delivery. If they do not match, either the document was not sent by the owner or it was tampered with in transit.
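The sign-and-verify steps above can be sketched in a few lines of Python. This is a toy illustration only: it uses textbook RSA with two small, well-known primes and no padding, and it shrinks the SHA-256 digest to fit the tiny modulus, so it is not secure. The names `sign`, `verify` and `digest_as_int` are made up for this sketch; real systems should use a vetted cryptography library.

```python
import hashlib

# Toy key pair (assumption for illustration): two small Mersenne primes.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest_as_int(document: bytes) -> int:
    """Hash the document with SHA-256 and reduce it to fit the toy modulus."""
    digest = hashlib.sha256(document).digest()
    return int.from_bytes(digest, "big") % N


def sign(document: bytes) -> int:
    """Sender: 'encrypt' the message digest with the private key."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Receiver: recompute the digest and compare it with the decrypted one."""
    decrypted_digest = pow(signature, E, N)  # uses the sender's public key
    return decrypted_digest == digest_as_int(document)


if __name__ == "__main__":
    doc = b"Pay Bob 10 dollars"
    sig = sign(doc)
    print(verify(doc, sig))                       # untouched document
    print(verify(b"Pay Bob 1000 dollars", sig))   # tampered document
```

The first check succeeds because the receiver's digest equals the one recovered from the signature; changing even one byte of the document changes the digest, so the second check fails.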

How does your web browser get the IP address of a website?

The Domain Name System (DNS) works tirelessly to answer your browser whenever it needs the IP address of a website. DNS is a hierarchical architecture that resolves the human-readable name of a computer on the Internet into a machine-usable IP address.

The process of resolving a website address typed into the browser, for instance that of the YouTube server, is as follows:

  • Assuming this is the first time www.youtube.com is accessed, the browser asks the operating system to resolve the name, and a DNS query goes to the local DNS server configured in the operating system. This is usually a DNS server in your ISP's network (e.g. Comcast), known as a caching DNS server.

  • If the Comcast DNS server has not resolved the YouTube name recently (which is very unlikely), it finds no entry in its cache, so it sends a request to a root name server. The addresses of the root name servers are preconfigured on the Comcast DNS server.

  • The root name server is authoritative for the root zone: it looks at the rightmost portion of the domain name and returns the name server(s) for the matching Top Level Domain (TLD), here ".com".

  • Once the Comcast DNS server has the "com" TLD name servers, it sends another request to one of them, which in turn returns the names and IP addresses of the DNS servers that are authoritative for youtube.com (presumably run by Google).

  • The Comcast DNS server now makes the last request, to Google's DNS server, to resolve the IP address for www.youtube.com. This ends the chain of queries made by the caching server, which stores the resolved IP address in its cache with a Time To Live (TTL) value. The TTL tells when the cache entry expires. The caching server then responds to your computer (e.g. a laptop), which made the initial request.

  • Your laptop now knows the IP address to use when requesting the web page from the www.youtube.com server. It also caches the resolved entry at the operating system level, which makes subsequent requests to the same server much faster by avoiding all of the above calls until the cache entry expires.
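The walk from the root server down to the authoritative server can be simulated without touching the network. The sketch below is only a model of the steps above: the server names, the zone data in `SERVERS`, the TTL and the IP address (taken from a documentation-only range) are all invented for illustration and do not reflect the real YouTube records.

```python
# Hypothetical name servers: each either refers the query onward or answers it.
SERVERS = {
    "root":         {"referrals": {"com": "com-tld"}},
    "com-tld":      {"referrals": {"youtube.com": "youtube-auth"}},
    "youtube-auth": {"answers": {"www.youtube.com": ("203.0.113.10", 300)}},
}


def resolve(name: str):
    """Follow referrals from the root until some server answers the query.

    Returns (ip_address, ttl_seconds, list_of_servers_asked).
    """
    server = "root"
    asked = [server]
    while True:
        zone = SERVERS[server]
        if name in zone.get("answers", {}):
            ip, ttl = zone["answers"][name]
            return ip, ttl, asked
        # Look for a referral whose suffix matches the name being resolved.
        next_server = None
        for suffix, target in zone.get("referrals", {}).items():
            if name == suffix or name.endswith("." + suffix):
                next_server = target
                break
        if next_server is None:
            raise LookupError(f"cannot resolve {name}")
        server = next_server
        asked.append(server)


if __name__ == "__main__":
    ip, ttl, asked = resolve("www.youtube.com")
    print(ip, ttl, " -> ".join(asked))
```

In this model the caching server asks three servers in turn (root, the "com" TLD server, then the authoritative server), which mirrors the three requests described in the list.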

It's good to understand DNS caching in a little more detail. DNS caching is the process through which the local DNS server (known as the caching DNS server) stores an already resolved IP address for a certain period of time. As described in the steps above, the computer's operating system and the local DNS server first look in their caches for the IP address of a domain name. If the record isn't in the cache, the caching server reaches out to the authoritative DNS servers (root DNS, TLD DNS, the destination network’s DNS, etc.) to resolve the IP address, which is then stored in the caching DNS server as well as in the local computer’s OS.

Each cached record has an expiration time attached to it. This is the Time To Live (TTL), which is set (in seconds) by the authoritative DNS server. The cached record is deleted when the TTL has elapsed. If the IP address for that domain name is requested after that, the name has to go through the same DNS resolution process again. DNS caching improves the performance of DNS resolution. It not only speeds up lookups for users but also reduces the amount of DNS query traffic on the Internet.
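A TTL-based cache like the one described here can be sketched as a small class. This is a simplified model, not how any particular resolver is implemented; the class name `TTLCache`, its methods and the injectable `clock` parameter are assumptions made for the example (the clock is injectable so expiry can be demonstrated without waiting).

```python
import time


class TTLCache:
    """Stores name -> IP entries that expire after their TTL (in seconds)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (ip, expiry_time)

    def put(self, name: str, ip: str, ttl: float) -> None:
        self._entries[name] = (ip, self._clock() + ttl)

    def get(self, name: str):
        """Return the cached IP, or None if it is missing or has expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # TTL elapsed: drop the record
            return None
        return ip


if __name__ == "__main__":
    now = [0.0]                            # fake clock for the demo
    cache = TTLCache(clock=lambda: now[0])
    cache.put("www.youtube.com", "203.0.113.10", ttl=300)
    print(cache.get("www.youtube.com"))    # still fresh
    now[0] = 301.0
    print(cache.get("www.youtube.com"))    # expired, full resolution needed
```

A `None` result is the signal that the full resolution process has to run again, after which the fresh answer is stored with its new TTL.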

Proxy server and its benefits in a computer network

A proxy server is a middleware box that sits between the requesting user’s machine and the destination server to control access. Here’s how a proxy server operates technically:

  • The proxy server is set up in the network so that all traffic passes through it. It is not a transparent middleware box like a NAT box, so the client (web browser) accessing the web server has to be configured with the proxy server's address.

  • When a request comes from the web browser, the proxy server first checks its cache for the requested information. If the information is in the cache, i.e. it was requested by another user earlier, the proxy returns it to the web browser without going to the web server. This saves valuable network bandwidth and time.

  • If the requested information isn’t in the cache, the proxy opens a new request to the destination web server on the client's behalf (much as a NAT box would) and stores the response in its cache.

  • It then returns the information to the original requester.
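The cache-then-fetch behaviour in these steps can be modelled with a small class. It is a minimal sketch: `CachingProxy`, `handle` and the injected `fetch_from_origin` function are hypothetical names, the origin server is faked with a plain function, and real proxies also honour cache-control headers and expiry, which are left out here.

```python
from typing import Callable, Dict


class CachingProxy:
    """Answers from its cache when possible, otherwise asks the origin server."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]):
        self._fetch = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.origin_requests = 0  # how often the origin had to be contacted

    def handle(self, url: str) -> bytes:
        if url in self._cache:            # cache hit: no trip to the web server
            return self._cache[url]
        self.origin_requests += 1         # cache miss: fetch on the client's behalf
        body = self._fetch(url)
        self._cache[url] = body           # remember the response for later users
        return body


if __name__ == "__main__":
    def fake_origin(url: str) -> bytes:
        return ("page for " + url).encode()

    proxy = CachingProxy(fake_origin)
    proxy.handle("http://example.com/")   # first user: goes to the origin
    proxy.handle("http://example.com/")   # second user: served from the cache
    print(proxy.origin_requests)
```

The counter makes the bandwidth saving visible: two client requests for the same page cause only one request to the origin.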

You can implement a proxy server to reap the following benefits:

  • A proxy server improves network performance, as cached information is reused to serve multiple users, saving network bandwidth and time.

  • A proxy server increases security in the network, as you have more control over what users are allowed to access, for example by blocking malicious sites. It also keeps a log of all traffic, which can be used to investigate security issues and vulnerabilities.

  • You can conveniently enforce your own policies through a proxy server, such as restricting access to illegal or unproductive sites.
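The access-control and logging benefits can be illustrated with a simple host blocklist. This is a sketch under assumptions: the blocked host names, the `is_allowed` and `filter_request` helpers and the in-memory `access_log` are all invented for the example, and a production proxy would read its policy from configuration and write logs to durable storage.

```python
from urllib.parse import urlsplit

# Hypothetical policy: hosts (and their subdomains) users may not visit.
BLOCKED_HOSTS = {"malware.example", "games.example"}

access_log = []  # (user, url, decision) tuples kept for later investigation


def is_allowed(url: str, blocked=BLOCKED_HOSTS) -> bool:
    """Return False if the URL's host is a blocked host or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return not any(host == b or host.endswith("." + b) for b in blocked)


def filter_request(user: str, url: str) -> bool:
    """Apply the policy and record the decision in the access log."""
    allowed = is_allowed(url)
    access_log.append((user, url, "ALLOW" if allowed else "DENY"))
    return allowed


if __name__ == "__main__":
    print(filter_request("alice", "http://www.games.example/play"))
    print(filter_request("bob", "http://example.org/docs"))
    for entry in access_log:
        print(entry)
```

Matching on the host suffix, rather than on the whole URL string, keeps a blocked name from being bypassed through a subdomain while not blocking unrelated hosts that merely contain the same text.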

NFS vs FTP in a nutshell: The Networking perspective

File Transfer Protocol (FTP) is a client-server protocol for transferring files across a network, implemented on top of TCP. FTP works much like HTTP in that the FTP client makes explicit requests to the FTP server, so file access is not transparent to the client. Although FTP is simpler to implement than other protocols (like NFS), it works in a slightly different way: the FTP client and server handle control and data separately. The FTP client first connects to the standard port 21 for control, and the two sides then agree on a different port for the data connection. The data connection can be initiated from either side (the server in active mode, the client in passive mode), depending on the implementation and network configuration. FTP is primarily used as a repository of shared files so that they can be managed centrally. Because of its simplicity, it allows easy access to files across the network. A side note: DO NOT use FTP, use SFTP. FTP sends your password in plain text.
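To make the "agree on a different port" step concrete: in passive mode the server answers the client's PASV command on the control connection with a 227 reply containing six numbers, four for the IP address and two for the data port. The sketch below parses such a reply; the function name `parse_pasv_reply` and the sample reply (with an address from a documentation-only range) are made up for illustration.

```python
import re


def parse_pasv_reply(reply: str):
    """Extract (host, port) for the data connection from a 227 PASV reply.

    The reply carries h1,h2,h3,h4,p1,p2; the port is p1 * 256 + p2.
    """
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError("not a valid PASV reply: " + reply)
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


if __name__ == "__main__":
    host, port = parse_pasv_reply("227 Entering Passive Mode (192,0,2,10,19,136)")
    print(host, port)  # the client opens its data connection to this address
```

Here 19 * 256 + 136 gives port 5000, so the control conversation stays on port 21 while the file contents travel over a separate connection to port 5000.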

Network File System (NFS) is also a file sharing protocol. It is implemented on top of RPC, which traditionally ran over UDP (newer deployments commonly use TCP, and NFSv4 runs over TCP). NFS differs from FTP in that it provides transparent access to remote files that reside on the NFS server, i.e. from the end user's perspective the NFS client accesses local and remote files in the same manner. An NFS server exports directories to the network, and clients mount them, either explicitly or through an automounter, so users do not have to look for the server themselves. Another important distinction is that NFS (through version 3) doesn’t maintain connection state on the server, so it is much more resilient to network instability. Like FTP, NFS is used to manage shared files centrally without duplicating them across the network. Because NFS access is transparent, it is often used to host users' home directories: the directory can be mounted on any machine, which gives the user the feeling that the home directory is local to that machine.