Friday, July 3, 2015

Digital Signature and how it is used on the Internet

A digital signature is, as the name implies, the digital equivalent of a handwritten signature: a distinctive digital pattern that identifies a person or system. Public-key cryptography is used to sign a document, providing a non-repudiable way to prove that the document was sent by the owner of the key and was not tampered with along the way. Let's see this in action -

  • The process starts by generating a key pair: a public key and a private key. The public key is distributed publicly to the world and the private key is kept secret by the owner of the key pair. The distribution of the public key is done in various ways, such as a key signing party, publishing on a well known website, etc.
  • Now, using a one-way cryptographic hash function (e.g. MD5, SHA-1, SHA-2, etc.), a message digest or hash code is generated from the original document. Hash functions take an arbitrarily long piece of plaintext and compute from it a fixed-length string
  • The message digest is then encrypted using the private key and the encrypted hash code is appended to the document. This is the digital signature
  • The receiver first generates the hash code of the document using the same hash function as the sender
  • The receiver decrypts the appended hash code using the sender's public key
  • The receiver compares these two hash codes and if they match, it's proved beyond doubt that the document was sent by the owner and, moreover, that the document wasn't tampered with in delivery. If the two hash codes do not match, it's an indication that either the document was not sent by the owner or it was tampered with in transit
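
To make the flow above concrete, here is a minimal sketch in Python, assuming the third-party cryptography package is installed; the 2048-bit RSA key, PKCS#1 v1.5 padding, and SHA-256 hash are illustrative choices, not a recommendation for any particular system:

    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives import hashes
    from cryptography.exceptions import InvalidSignature

    # Step 1: generate the key pair; the private key stays with the signer.
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()

    document = b"Please transfer the funds on Friday."

    # Steps 2-3: sign() hashes the document and encrypts the digest
    # with the private key; the result is the digital signature.
    signature = private_key.sign(document, padding.PKCS1v15(), hashes.SHA256())

    # Steps 4-6: the receiver recomputes the hash and checks it against the
    # signature using the sender's public key.
    try:
        public_key.verify(signature, document, padding.PKCS1v15(), hashes.SHA256())
        print("Signature valid: document is authentic and untampered")
    except InvalidSignature:
        print("Signature invalid: altered document or wrong sender")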

How does your web browser get the IP address of a website?

The Domain Name System (DNS) works tirelessly on the Internet to give your browser the answer whenever it needs the IP address of a website. DNS is a hierarchical architecture that resolves the human-readable name of a computer on the Internet into a machine-usable IP address.

The process of resolving a website address typed into the browser, for instance the YouTube server, is as follows:

  • As this is the first time accessing www.youtube.com, the browser creates a DNS query to the local DNS server configured in the operating system. This would be the DNS server in your ISP's network (e.g. Comcast), which is called a caching DNS server

  • If the Comcast DNS server has never resolved the YouTube domain (which would be very unlikely), it won't find an entry in its cache. This DNS server then makes a request to a root name server. The root name servers are configured manually in the Comcast DNS server

  • The root name server is the authoritative server that looks up the right-most portion of the domain name and returns the name server(s) for the Top Level Domain (TLD) of the dot com (".com") domain

  • Once the Comcast DNS server gets the "com" TLD name server, it makes another request to the "com" TLD DNS server, which in turn returns the IP address of the DNS server of YouTube's ISP (I presume it's Google).

  • Now Comcast's DNS server makes the last request, to Google's DNS server, to resolve the IP address for the domain www.youtube.com. This ends the recursive calls made by this caching server, which creates a cached entry of the resolved IP address with a Time To Live (TTL) value. The TTL tells when this cache entry will expire. It then responds back to your computer (e.g. laptop) which made the initial request

  • Your laptop now knows the IP address and can make the request to the www.youtube.com server to retrieve the web page in the browser. It also caches the resolved entry at the operating-system level to make subsequent requests to the same server much faster by avoiding all of the above calls until the cache entry expires
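
From an application's point of view, all of that machinery collapses into a single call to the operating system's resolver. A quick Python sketch (the host name is just an example):

    import socket

    # The OS resolver, backed by the caching DNS server described above,
    # performs (or reuses) the recursive lookup and returns the address(es).
    print(socket.gethostbyname("www.youtube.com"))        # one IPv4 address

    for info in socket.getaddrinfo("www.youtube.com", 443):
        print(info[4][0])                                 # all addresses (IPv4 and IPv6)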

It's good to know the concept of DNS caching in a little more detail. DNS caching is the process through which the local DNS server (known as a caching DNS server) stores already resolved IP addresses for a certain period of time. As mentioned in the answer above, the computer's operating system and the local DNS servers first look into the stored cache for the IP address of a domain name. If the cache doesn't have the record available, they reach out to the authoritative DNS servers (root DNS, TLD DNS, destination network's DNS, etc.) to resolve the IP and then store it in the caching DNS server as well as in the local computer's OS.

The caching of each record has an expiration time tagged along with it. This is called the Time-To-Live (TTL), which is set (in seconds) by the authoritative DNS server. The cache entry is deleted when the TTL has elapsed. After that, if the IP address of that domain name is requested again, it has to go through the same DNS resolution process. DNS caching is used to improve the performance of DNS resolution. This not only helps improve network performance but also keeps the Internet free of needless DNS query traffic
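
The idea behind a caching resolver can be illustrated with a toy TTL cache; real resolvers are far more involved, and the answer and TTL below are made up for the example:

    import time

    cache = {}   # domain name -> (ip address, absolute expiry time)

    def resolve(name, authoritative_lookup):
        entry = cache.get(name)
        if entry and entry[1] > time.time():
            return entry[0]                       # cache hit, TTL not yet expired
        ip, ttl = authoritative_lookup(name)      # full recursive resolution
        cache[name] = (ip, time.time() + ttl)     # keep the answer until the TTL elapses
        return ip

    # Hypothetical authoritative answer with a 300-second TTL.
    print(resolve("www.youtube.com", lambda n: ("203.0.113.10", 300)))
    print(resolve("www.youtube.com", lambda n: ("203.0.113.10", 300)))  # served from cache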

Proxy server and its benefits in a computer network

A proxy server is a middleware box that sits between the requesting user's machine and the destination server to provide access control. Here's how a proxy server technically operates:

  • The proxy server is configured in the network so that all the traffic passes through it. This is not a transparent middleware box like a NAT box, so the client (web browser) accessing the web server has to be configured with the proxy server's address (a client-side configuration sketch follows this list)

  • When a request comes from the web browser to the proxy server, the proxy server first checks its cache for the requested information. If the information is available in the cache, i.e. it was requested by some other user earlier, it returns the information to the web browser without going to the web server. This saves valuable network bandwidth and time

  • If the requested information isn't available in the cache, it opens a new request to the destination web server (acting somewhat like a NAT) and stores the response in its cache

  • Then it returns the information back to the original requester
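
On the client side, using a proxy is just a matter of pointing the HTTP library at it. A minimal sketch with Python's standard library, assuming a hypothetical proxy at proxy.example.com:3128:

    import urllib.request

    # Route all requests made through this opener via the (hypothetical) proxy;
    # the proxy answers from its cache or forwards to the origin server.
    proxy = urllib.request.ProxyHandler({
        "http": "http://proxy.example.com:3128",
        "https": "http://proxy.example.com:3128",
    })
    opener = urllib.request.build_opener(proxy)

    with opener.open("http://example.com/") as resp:
        print(resp.status, len(resp.read()), "bytes")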

You can implement a proxy server to reap the following benefits:

  • A proxy server improves the performance of the network, as the cached information is reused to serve multiple users, saving network bandwidth and time

  • A proxy server increases security in the network, as you have more control over what information users are allowed to access by restricting malicious sites. It also keeps a log of all traffic, which can be used to investigate security issues and vulnerabilities as well

  • You can set your own policy in a convenient manner through a proxy server, like restricting access to illegal sites, unproductive sites, etc.

NFS vs FTP in a nutshell: The Networking perspective

File Transfer Protocol (FTP) is a client-server protocol for transferring files over a network, implemented on top of TCP. FTP works much like HTTP in that the FTP client makes specific requests to the FTP server, so it is not transparent to the client. Though the implementation of FTP is simpler than other protocols (like NFS), it works in a somewhat different way. The FTP client and server treat control and data differently. On connection, the FTP client connects to the standard port 21 and the two sides then agree upon a different port for the data to be transferred. The data connection can be initiated from either side depending on the implementation and the network configuration. The primary use of FTP is as a repository of shared files so they can be centrally managed. Because of its simplicity, it allows easy access to the files across the network. A side note: DO NOT use FTP, use SFTP. FTP sends your password in plain text.
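
The control/data split is easy to see from a client. A small sketch using Python's standard ftplib (host and credentials are hypothetical, and the side note above stands: prefer SFTP for anything real):

    from ftplib import FTP

    ftp = FTP()
    ftp.connect("ftp.example.com", 21)           # control connection on the standard port 21
    ftp.login("anonymous", "guest@example.com")  # credentials travel in plain text!

    # The listing itself travels over a separately negotiated data connection
    # (passive mode by default), while commands stay on the control channel.
    ftp.retrlines("LIST")
    ftp.quit()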

Network File System (NFS) is also a file sharing protocol, implemented on top of RPC, which typically runs over UDP. NFS is different from FTP in the sense that it provides transparent access to the remote files residing on the NFS server, i.e. the NFS client accesses files (local and remote) in the same manner from the end user's perspective. The NFS server is a broadcasting server that publishes itself within the network, so any client connected to that network becomes aware of it without explicitly looking for it. Another important distinction is that NFS doesn't maintain the state of the connection, so it is much more resilient to network instability. NFS is used to centrally manage shared files (similar to FTP) without duplicating them across the network. Because NFS access is transparent, it is often used to host users' home directories on the network, where a directory can be mounted on any machine, giving the user the feeling that the home directory lives on the local machine.

Tuesday, June 30, 2015

What makes a network IPv6 capable

Before a network can be declared capable of IPv6 connectivity, certain infrastructure components have to be in place beforehand.

Operating System: The operating system (or the client) has to be IPv6 enabled. This could be in the form of a dual-stack client or an IPv6-only client, so that it can have an IPv6 address assigned to it.

Dynamic Host Configuration Protocol (DHCPv6) server: Nowadays, almost every network uses DHCP to assign IP addresses, though it is not a mandatory device. But to declare the network IPv6 enabled, there has to be a DHCPv6 server if the network needs to be managed in a stateful manner to autoconfigure the network IP addresses. The alternative is stateless address autoconfiguration, which doesn't need a DHCPv6 server (and which comes with some security risks).

Router: the router has to be able to recognize and process IPv6 packets. The other option is tunneling, which doesn't make the network IPv6 enabled but is just a workaround.

Domain Name System: the DNS has to be capable of resolving IPv6 addresses so that the source host can reach the destination IPv6 hosts.

There are a few other types of devices that are sometimes placed in the network, like a NAT box, a proxy server, or a firewall. If these devices are present in the network (and they probably are), then they should also be IPv6 aware so that end-to-end connectivity can take place.

There’s a hard way to make the IPv4 network works for IPv6 communication, which is through tunneling. In that way the IPv6 packets are transmitted by encapsulating into IPv4 packets. But this is a complicated way to achieve IPv6 connectivity with high cost of configuration and performance

Flow control & Congestion control: The two most important features of TCP that keep the Internet alive

Flow control is the mechanism by which the sender and receiver sync up the data rate between them so as not to overwhelm the receiver, in the case where the receiver has less capacity than the sender.

Congestion control, on the other hand, is the sender trying to figure out what the network is able to handle. It is the mechanism at the sender's end to detect, through data loss on the transmission link, how much the link can carry and to adjust the throttle accordingly to be most efficient.

Both flow and congestion control are necessary to effectively transmit data from sender to receiver. Without flow control, the sender would overwhelm the receiver's buffer, the data would be discarded, and the sender would be forced to continuously re-transmit the unacknowledged data. This would tremendously hurt TCP throughput and the performance of the network link. Similarly, congestion control helps the sender determine whether the data being sent over the network can actually be delivered to the receiver. There could be a situation where both the sender and receiver are perfectly able to handle a higher data rate, but if the link in between isn't capable enough, a lot of bandwidth would be wasted just re-transmitting the data lost on the link. This would effectively make the data transmission slower than the true capacity of the link. Though both are necessary to achieve optimal performance, they are essentially two different things:
  • Flow control is between sender and receiver, whereas congestion control is between the sender and the network
  • Flow control is dictated mostly by the receiver through negotiation, whereas the sender dictates congestion control
  • Flow control syncs up the data transmission between sender and receiver, whereas congestion control syncs up the data transmission between the sender and the network link
  • Flow control is end to end, but congestion control is not end to end; it resides at the sender's end alone

Implementation of flow control: TCP uses the sliding window model to implement flow control. This is achieved through the advertised window size from the receiver. The receiver communicates its buffer size during connection establishment and can change it at any time during the life cycle of the connection. The receiver and sender negotiate the buffer size, where the SWS (Sender Window Size) is set to the RWS (Receiver Window Size), ensuring that the sender isn't sending more data than the receiver can receive before acknowledging it.
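
A toy calculation shows what the advertised window buys the sender; the variable names are illustrative, not taken from any particular TCP implementation:

    def usable_window(advertised_window, last_byte_sent, last_byte_acked):
        """How many more bytes the sender may put on the wire right now."""
        bytes_in_flight = last_byte_sent - last_byte_acked
        return max(0, advertised_window - bytes_in_flight)

    # Receiver advertised 64 KB; 48 KB is already in flight and unacknowledged,
    # so the sender may transmit at most 16 KB more before the next ACK.
    print(usable_window(64 * 1024, 148 * 1024, 100 * 1024))   # -> 16384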

Implementation of congestion control: TCP probes the network, starting with a small amount of data, to come up with an optimal transmission rate within the sliding window model. TCP uses a new variable in the sliding window, called the congestion window, to control the rate at which bytes are streamed. In conjunction with the sliding window's advertised window size, this congestion window helps determine the maximum allowed window, which is the minimum of the two. Unlike the advertised window, the congestion window is determined by the sender, based on the network link's ability to carry the data. Loss of data is used as an indication of congestion on the link, and the congestion window is set accordingly. TCP considers the network otherwise reliable (wireless is handled differently, though).

Various techniques are used to implement congestion control: slow start, Additive Increase/Multiplicative Decrease (AIMD), fast re-transmit, fast recovery, etc. In AIMD, TCP starts streaming bytes at a minimum rate and increases the rate in an additive fashion. Another implementation is to use slow start with a small amount of data and then increase the rate exponentially up to the congestion threshold level. After that it goes back to additive increase until congestion is sensed, which triggers TCP to sharply decrease the rate and also reset the congestion threshold to a lower number (depending on the implementation). This continues throughout the life cycle of the connection, keeping the sender in sync with the network link's ability to handle the transmission
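
The shape of slow start plus AIMD can be seen in a toy simulation; this is not a real TCP implementation, just the window arithmetic with a loss injected at one round trip:

    MSS = 1                 # count the window in segments for simplicity
    cwnd = 1 * MSS          # congestion window
    ssthresh = 16 * MSS     # congestion threshold

    def on_rtt_without_loss():
        global cwnd
        if cwnd < ssthresh:
            cwnd *= 2       # slow start: exponential growth
        else:
            cwnd += MSS     # congestion avoidance: additive increase

    def on_loss():
        global cwnd, ssthresh
        ssthresh = max(cwnd // 2, 2 * MSS)   # multiplicative decrease
        cwnd = ssthresh                      # fast-recovery-style restart

    for rtt in range(12):
        on_loss() if rtt == 8 else on_rtt_without_loss()
        print(f"RTT {rtt}: cwnd={cwnd}, ssthresh={ssthresh}")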

Monday, June 29, 2015

Basic concept of Collision and Broadcast Domains in Computer Networking

A Collision Domain is the group of computer devices that are connected to each other in a topology where every packet transmitted over the network has the potential to collide on the network link. An 802.3 network uses the CSMA/CD method to send data through a collision domain. In an 802.3 network, only one device is supposed to send a frame over the network at a time, when it finds that no other device is using the link, i.e. the network is free. A collision happens when two (or more) devices sense the network as free and start sending frames at the same time. The frames then collide.

A Broadcast Domain is the concept where every device connected to a network is able to reach all other devices with a single message, sent using a special message known as a broadcast message. In an 802.3 network, by default, every device is part of the broadcast domain as it listens to every data frame sent over that network. In the broadcast domain, every Network Interface Card (NIC) receives every frame transmitted but discards all except the ones addressed to itself. The exception is the broadcast message, which is accepted by every NIC. Thus, in a broadcast domain, any device can reach every other device at any point in time.
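
A broadcast is easy to produce in code. The sketch below sends one UDP datagram to the local broadcast address; the port number is arbitrary, and every host in the same broadcast domain listening on it would receive the message:

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)   # allow broadcast sends

    # One datagram, delivered to every NIC in the local broadcast domain.
    s.sendto(b"hello, everyone on this segment", ("255.255.255.255", 9999))
    s.close()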

An Ethernet Hub is a dumb device that forwards every Ethernet frame to all of its other ports, thus creating one large collision domain for the devices connected to its ports. This also creates a single broadcast domain for the connected devices. Essentially, it's like a bus topology where all the devices are connected to one thick cable.

An Ethernet Switch creates multiple collision domains, determined by the number of ports it has, i.e. a single collision domain for the devices that are connected to a single port. When a device sends a frame to another device connected to the switch, the switch forwards the frame only to the port to which the destination device is connected. In that way, if multiple devices are connected to one port (like a star-of-stars topology), then the devices connected to that port form one collision domain. On the other hand, all the devices connected to a switch form a single broadcast domain, i.e. one device can still reach all the devices connected to the switch using a single broadcast message. So it can be said that an Ethernet switch creates a single broadcast domain while breaking it into multiple collision domains.

A Router breaks both the collision domain and the broadcast domain down to the individual port level. That means every network connected to one of its ports forms its own collision domain and its own broadcast domain. The purpose of a router is to connect multiple networks, so one port of a router bounds a single collision domain for that network as well as a single broadcast domain. The network connected to the router decides its own internal details. All broadcast packets are dropped at the router.





Tuesday, January 6, 2015

Software Development with Built-in Security

Quite often we encounter the situation where software security has been an afterthought, an item to be checked off by the group responsible for safeguarding it. That approach was effective to some extent while the software sat in an isolated silo, totally disconnected from the outside world and accessed only within the boundary of the organization. But the ubiquitous presence of the Internet demands an immediate change in that approach. The time has come to shift the paradigm from reactive software security to proactive software security, where security is not implanted into the software after it is built but is a built-in feature grown organically from the very beginning of the software's inception.

So, how can security be built into the core fabric of software? This requires not just a change in process and perception but a complete cultural change, where software security isn't the headache of the security guy sitting isolated in a cubicle at the corner of the office, but a collaborative responsibility of everyone involved in the software development life cycle (SDLC), i.e. analysis, design, development, testing, and implementation.

THE GOAL OF SOFTWARE SECURITY

Let's first clarify the goals of software security, i.e. what we would like to achieve through this journey of secure software development. The goals are to achieve Confidentiality, Integrity, and Availability (CIA) in the software. Confidentiality makes sure that the software provides access to its assets (i.e. data, processing, capability, etc.) only to authorized users on a need-to-know basis. Integrity is the attribute by which the software ensures that the assets are kept consistent in every interaction. Availability is the characteristic that ensures a guaranteed level of availability of the software, as agreed upon by the software provider and the users. Availability means not just being accessible, but being accessible with a certain level of usability, throughput, and completeness. Each phase of software development has to make sure that the CIA goals are met or, at least, taken into consideration.

PHASES OF SOFTWARE SECURITY DEVELOPMENT

The entire process of effectively incorporating security into software can be broken into four phases: Planning, Execution, Monitoring, and Controlling. These four phases effectively encompass the secure software development cycle, as depicted in the diagram below.


PLANNING

This is the most overlooked part when secure software development is taken into consideration. Most of the time, the goals of secure software are not met due to poor or, in some cases, absent planning. The planning process starts with defining the goals of the software security and ends with creating a security execution plan. Here's a high-level depiction of the planning process and the desired artifacts to be produced through it.




Below is a deep dive into the details of the planning process:

1.1. The first step is to define the goals that the software security is targeting to achieve. The goals shouldn't be a vague statement like "we want the software to be resilient and secure when under attack"; they should follow the SMART criteria, i.e. Specific, Measurable, Achievable, Realistic and Time bound. Here are a few examples of SMART goals for secure software: "The software would
  - have zero code vulnerabilities as reported by the static code scanner,
  - log 100% of user actions at runtime,
  - notify the security administrator within 30 minutes of predetermined suspicious activities,
  - shut down all the services that access the "HIGH VALUED ASSETS", such as the database that holds customers' personal information"

1.2 Define a policy and procedure for how the software security activities will be integrated into the Software Development Life Cycle (SDLC). One way of achieving that is to define a "toll gate" at every stage of the SDLC along with its pass-through criteria. Here are a few examples of passing criteria:

- At the end of the Requirement phase, have a requirement review to verify that Security Requirements have been signed off by the Information Security Officer (ISO)

- At the end of Development, an application code scanner is run and 100% of critical vulnerabilities and 80% of moderate vulnerabilities are resolved

- QA has executed 100% of the security test cases and all have passed with provable evidence

1.3 Develop a Threat Model of the software being developed. A threat model is a way to understand and prioritize risks and evaluate mitigation possibilities. The steps of a threat model are: identify assets, understand systems, understand threats, categorize threats, and rank the threats.
A few things need to be considered while analyzing the system (i.e. the software under construction) and its associated threats: a detailed understanding of the underlying technologies used in the software, the vulnerabilities and risks of those technologies, and the target market segment. Through the development of a threat model, the mitigation plan is created. Usually the high-ranked threats are mitigated, while moderate and low-ranked threats are kept documented so that if those threats materialize, an immediate response can be put in place

1.4 Make sure that the software development plan incorporates the security artifacts

1.5 Develop a Security Test Plan & Strategy for the software in accordance with the Software Threat Model. The strategy defines how the software will be tested while being developed during the execution phase

1.6 Document the security assumptions. Though this is true for every other aspect of the SDLC, for security it ranks very high and is absolutely critical. The reason is that software is usually built for a target user group in a certain operating environment but may eventually end up being used in a completely different environment. As an example, software that was developed to serve internal customers within a corporate security perimeter may end up being used over the Internet once the company grows across geographic locations around the world. In that situation, the software's vulnerabilities should be re-evaluated, but if the initial assumptions weren't documented, this may not happen at the time of expanding the scope

1.7 The Security Response Plan (SRP) has to be created as part of the planning process. The SRP is used once the software is in operation. It should elaborate the detailed response procedures to follow when software security threats materialize, as well as the roles and responsibilities of the security response team.

2. EXECUTION

In the execution phase, all the action plans created during the planning phase are implemented. At every step of the standard Software Development Life Cycle (SDLC), the security aspect is considered and implemented.

Requirement Analysis: the first step of the SDLC is the requirement analysis phase, and security has to start there as well. Along with the standard functional and non-functional requirement analysis, a detailed security requirement analysis is done. This security requirement analysis starts with the standard authentication and authorization requirements of the software and then continues to develop the other requirements in light of the previously developed Software Threat Model. It is very important to document all the functional and non-functional security requirements (sometimes dubbed abuse cases) so that the software can be verified to have those requirements built into the system. Below is the high-level breakdown of the execution process and the resulting artifacts:



Design and Development: the development of the software starts with its design. The security requirements of the software are designed in parallel with the software's functional behavior. If the security requirements are not designed upfront in the process, security ends up being a band-aid on the system rather than built into it. In certain situations, the design of a functional requirement of the software is influenced by the security design. As an example, the way the software accesses the customer's personal confidential data (if available in that system) is heavily dictated by the security requirements of the system. Here are a few examples of design guidelines to ensure software security:

- Appropriate encryption has to be in place to prevent sniffing. This could mean the use of SSL or, if further security is needed, a second layer of encryption can be enforced for certain assets that require a higher level of confidentiality


- Data masking would be implemented so that data is visible to people strictly on a need-to-know basis

- Enforce the use of software libraries and frameworks that naturally prevent certain security threats. As an example, the use of an Object Relational Mapping (ORM) framework to access an RDBMS database puts an extra level of protection against SQL injection (a minimal sketch of the underlying idea follows this list)

- Every access and access attempt would be logged

- The movement of confidential data needs to be traced. Inbound and outbound data movements are to be logged with certain details (e.g. user ID, computer terminal, geographic location, time zone, etc.)
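
As a minimal sketch of the SQL injection point above, here is the difference between building a query by string concatenation and letting the driver (or an ORM underneath) bind parameters; it uses Python's built-in sqlite3 purely for illustration:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, ssn TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', '123-45-6789')")

    user_input = "alice' OR '1'='1"    # a classic injection attempt

    # Vulnerable pattern: concatenating user input into the SQL text.
    # query = f"SELECT ssn FROM users WHERE name = '{user_input}'"   # DON'T

    # Safer pattern: a parameterized query treats the input strictly as data.
    rows = conn.execute("SELECT ssn FROM users WHERE name = ?", (user_input,)).fetchall()
    print(rows)   # [] -- the injection attempt matches nothing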

During development, developers need to use a static software security analyzer (e.g. IBM Security AppScan Source) while performing their unit testing. The software functionality can only be released when all critical vulnerabilities are resolved. Though zero vulnerabilities reported by a static analyzer does not guarantee secure software, it is at least a good starting point where all the known security issues are handled. Many security vulnerabilities are exploited through copy-book attacks on well-known weaknesses, so there is no excuse for letting the well-known vulnerabilities slip through.

Testing: the quality assurance of the software is done through the testing phase, and incorporating security testing is necessary to develop secure software. The testing process comprises two parts: test strategy and test execution. The test strategy was developed as part of the planning phase; now, during the execution phase, the security test cases are created and subsequently executed on the developed software. Successful execution of the security test cases is a precondition for releasing the software.

Deployment: This is the last phase of the SDLC, where the other aspects of software security are used to maintain the security of the system. It is worth noting that even highly secure software can be deemed vulnerable and at risk due to a substandard deployment environment. Deployment security consists of:
    - Physical security of the server and network systems
    - Security of the servers' operating systems (through operating system hardening)
    - Security of the platform, consisting of the application server, database server, web server, etc.
    - Security of the computer network
    - Security of the end users' computing platforms (when possible)

3. MONITORING

Unlike most other software development activities (e.g. analysis, design, development, testing, etc.), the software security process does not stop at the completion of software development. Software needs to be kept under constant monitoring. The primary reason is that no amount of software security testing can be enough to declare any software free of security vulnerabilities or security risk. To understand why it's almost impossible, take this hypothetical scenario: consider that the software uses the Advanced Encryption Standard (AES) with 128-bit keys, and at the time of the software's delivery, testing was done using the computational resources available and the encryption was found unbreakable in a reasonable time period. But during the life of the software, more powerful computers will become available at a much lower cost (Moore's law has guaranteed that), or a smart hacker may emerge who can break that encryption method using a smarter algorithm. That means the security group has to keep the software under monitoring to make sure that it is not compromising security on a daily basis. Below are some of the areas that need active monitoring:

3.1 The application code base should be periodically scanned for newly introduced vulnerabilities. This is crucial because, even though the software may have been released with zero vulnerabilities, over the period of its life new code is added, and sometimes that newly added code can introduce new vulnerabilities into code that was previously deemed secure

3.2 Monitor authorized users' access logs to verify they are accessing only the assets they are authorized to. Sometimes authorized users can gain access to confidential functionality or data because of a software bug, an administrative mistake, etc.

3.3 Monitor unauthorized access as well as attempts at unauthorized access to the system. The logs of attempted unauthorized access give the security group clues about potential vulnerability points where malicious users are trying to break in

3.4 The logs of inbound and outbound data movement in the software should be under constant monitoring. Unusual movement of data may be an indication of a security breach

3.5 Server resource utilization has to be kept under constant monitoring. The use of CPU, memory, disk space, network I/O, etc. can help identify a potential abuse of the software. For example, if the software process is taking up unusually high CPU time or using more memory than its usual trend without any reasonable cause, that is a reason to investigate the high usage (a minimal sketch of such a check appears after this list)

3.6 Periodic penetration testing, stress testing, and load testing should be conducted to probe the security vulnerabilities of the system

3.7 Prefer automated monitoring over manual monitoring. Manual monitoring is not sustainable; it should be used only on an exception basis, and only where automated monitoring is impossible or not cost effective

3.8 Software security should be quantified. Though it is very hard to attach a number that communicates the security level of software, it is not impossible. The aspects being monitored can be calibrated onto a scale and compared against that scale, which makes it much easier to track the improvement or decline of the software's security risk. There is a downside to quantifying software security, as it may give a false sense of security, but the benefits outweigh the disadvantage

3.9 Finally, prefer visual presentation of the monitoring results over text-based results. The human brain is more naturally tuned to visual cues than to text interpretation
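
As a minimal sketch of the resource-utilization check mentioned in 3.5, assuming the third-party psutil package and purely illustrative thresholds:

    import psutil

    CPU_BASELINE = 60.0    # percent; hypothetical "usual trend" values
    MEM_BASELINE = 75.0    # percent

    cpu = psutil.cpu_percent(interval=1)      # sample CPU usage over one second
    mem = psutil.virtual_memory().percent     # current memory usage

    if cpu > CPU_BASELINE or mem > MEM_BASELINE:
        print(f"Investigate: CPU {cpu}% / memory {mem}% above the usual trend")
    else:
        print(f"Within trend: CPU {cpu}% / memory {mem}%")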

4. CONTROLLING

There is no value in monitoring if an effective controlling process is not in place. In the controlling process, the outcome of monitoring is used to take the necessary actions to prevent security threats and to provide feedback to the planning process to improve the software's overall security.

The primary goal of this phase is to mitigate potential threats or minimize the damage done when a potential security threat materializes. For example, if monitoring of the software detects unusual movement of confidential data or very high usage of server resources (CPU/memory), the software could, as an extreme measure, be temporarily shut down to minimize the damage and then, after proper investigation, be reopened if deemed secure. Controlling also involves applying software bug fixes, security patches, etc. in a timely manner to proactively thwart potential risks.

In conclusion, software security in software development is more of a culture than a process and can't be attained unless an organization fully appreciates its value. Software organizations should accept the fact that the cost of not focusing on security is very high and could even go as far as bankruptcy. Embracing software security in every aspect of the software development life cycle is the key to having security built in. If security is a module that is plugged into the software rather than built into its fabric, attackers can isolate the security module and force the software to yield to their attacks. Every piece of code has to be security aware, and only through that can robust software security be achieved.