A computer worm is a type of malware characterised by its ability to replicate itself in order to spread. Although at first glance this definition resembles that of a computer virus, the two differ. A virus replicates by inserting a copy of its genetic material (its code) into an executable program or a file that can carry it to suitable environments, whereas a worm usually does not reside in the file system (it typically uses only the CPU and memory) and is instead carried over by exploiting vulnerabilities in the operating system or in applications. For example, the CodeRed worm exploits a buffer overflow to propagate over the network. From the point of view of propagation models (with each individual host treated as a single unit), viruses spread much more slowly than worms, a major reason being that the primary function of a virus is to replicate, whereas that of a worm is to propagate.
Having established the differences, let’s take a closer look at worms. Though most worms are created with the sole intention of propagating and do not modify the host system, there have been cases where worms also carry and execute a payload. Since most worms propagate over networks, they can be classified by the transport layer protocol they use to scan for and propagate to vulnerable systems: TCP-based worms and UDP-based worms. TCP-based worms are said to be “latency-limited”, whereas UDP-based worms are said to be “bandwidth-limited”. This classification is preliminary and rests on the fact that a TCP-based worm must complete a 3-way TCP handshake before data transfer can begin, which makes latency the limiting factor, whereas a UDP-based worm can scan at the full bandwidth of its connection, which makes bandwidth the limiting factor.
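The difference between the two limits can be illustrated with a back-of-the-envelope calculation. The sketch below is a simplification under stated assumptions, not a measurement of any real worm: it assumes a TCP scanner keeps a fixed number of connections open and that each one is tied up for one round trip, and that a UDP scanner sends single fixed-size probes as fast as its uplink allows. The function names and all numbers are hypothetical.

```python
def tcp_scan_rate(concurrent_connections, round_trip_seconds):
    """Approximate probes per second for a latency-limited scanner.

    Assumption: each connection slot is occupied for one round trip
    while the handshake completes, so throughput is slots / latency.
    """
    return concurrent_connections / round_trip_seconds


def udp_scan_rate(uplink_bits_per_second, probe_size_bytes):
    """Approximate probes per second for a bandwidth-limited scanner.

    Assumption: one probe is one datagram and nothing is waited for,
    so throughput is uplink capacity / probe size.
    """
    return uplink_bits_per_second / (probe_size_bytes * 8)


# Hypothetical host: 100 parallel connections, 200 ms round trip,
# 100 Mbit/s uplink, 400-byte probes.
tcp_rate = tcp_scan_rate(100, 0.2)
udp_rate = udp_scan_rate(100e6, 400)
print(f"TCP-based: about {tcp_rate:.0f} probes/s")
print(f"UDP-based: about {udp_rate:.0f} probes/s")
```

With these illustrative numbers the UDP scanner is bounded only by the link, while the TCP scanner is bounded by how long each handshake keeps a connection slot busy.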
Since seeking correlations between biology and computer science has always been an active area of research, the spread of worms can be modelled on the real-life spread of bacteria. One propagation model describes the spread using a first-order differential equation: N da = (N a) K (1 - a) dt, where a is the fraction of vulnerable machines that are infected, K is the initial compromise rate, N is the number of vulnerable machines and t is time. The left-hand side is the number of machines newly infected during the interval dt; on the right-hand side, each of the N a infected machines compromises others at rate K, but only the fraction (1 - a) of its targets is still uninfected. Dividing both sides by N dt gives da/dt = K a (1 - a). This equation is known as the logistic growth model, which is characterised by rapid, near-exponential growth at the beginning followed by a slowdown as a plateau is reached. It is clearly an idealised equation, since it does not factor in fluctuations in N due to network unavailability or patched systems. Nor does it consider the variation in propagation speed caused by the differing speeds of connections between hosts.
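The shape of the logistic curve is easier to see numerically. The following is a minimal sketch that evaluates the model da/dt = K a (1 - a) in two ways: with its standard closed-form solution and with a simple forward Euler integration. The values chosen for K and the initial infected fraction a0 are illustrative assumptions, not figures fitted to any real outbreak, and the function names are hypothetical.

```python
import math


def logistic_fraction(t, k, a0):
    """Closed-form solution of da/dt = k*a*(1-a) with a(0) = a0."""
    growth = math.exp(k * t)
    return a0 * growth / (1.0 - a0 + a0 * growth)


def simulate_euler(k, a0, t_end, dt=0.001):
    """Integrate da/dt = k*a*(1-a) from 0 to t_end with forward Euler."""
    steps = int(round(t_end / dt))
    a = a0
    for _ in range(steps):
        a += k * a * (1.0 - a) * dt
    return a


# Illustrative parameters (assumptions): K = 1.8 per unit time,
# one machine in 100,000 infected at t = 0.
K = 1.8
A0 = 1e-5
for t in range(0, 11, 2):
    print(f"t={t:2d}  infected fraction={logistic_fraction(t, K, A0):.6f}")
```

The printed fractions stay close to zero for the first few time units, rise steeply in the middle, and flatten as they approach 1, which is the plateau described above.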
Common methods of protection against worms are signature-based detection in network Intrusion Detection Systems and anomaly-detection-based implementations on the hosts. Because worms generally scan aggressively, monitoring for rapid bursts of queries is one possible way to flag anomalous behaviour. Honeypots are also used to capture new worms in the wild so that they can be profiled quickly before they spread widely.
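The idea of flagging query bursts can be sketched as a sliding-window counter. The example below is a toy illustration, not the method of any particular Intrusion Detection System: it assumes connection events arrive as (timestamp, source, destination) tuples in time order, and it flags a source once it has contacted more distinct destinations within the window than a threshold allows. The class name, the window length and the threshold are all hypothetical.

```python
from collections import defaultdict, deque


class ScanBurstDetector:
    """Flags sources that contact many distinct destinations in a short window."""

    def __init__(self, window_seconds=1.0, max_distinct_targets=20):
        self.window_seconds = window_seconds
        self.max_distinct_targets = max_distinct_targets
        # source -> deque of (timestamp, destination), oldest first
        self._events = defaultdict(deque)

    def observe(self, timestamp, source, destination):
        """Record one connection attempt; return True if the source looks anomalous."""
        events = self._events[source]
        events.append((timestamp, destination))
        # Drop events that have fallen out of the sliding window.
        while events and timestamp - events[0][0] > self.window_seconds:
            events.popleft()
        distinct_targets = {dst for _, dst in events}
        return len(distinct_targets) > self.max_distinct_targets


# A hypothetical host probing a new address every 10 ms.
detector = ScanBurstDetector(window_seconds=1.0, max_distinct_targets=20)
for i in range(30):
    if detector.observe(i * 0.01, "10.0.0.5", f"192.0.2.{i}"):
        print(f"burst flagged after {i + 1} probes")
        break
```

A real deployment would have to tune the window and threshold against legitimate traffic, since busy servers and peer-to-peer clients also contact many hosts in a short time.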
 Open Problems in Computer Virology by Eric Filiol, Marko Helenius, Stefano Zanero