course TOC

2.5. Introduction to Computers

Data sizes and Speeds

Names for different sizes of data

When choosing a new computer we come across terms such as "1 TB hard drive" and "4 GB RAM", and to the uninitiated, this can be somewhat disconcerting. Data in a computer is represented as a series of bits (binary digits), that is, ones and zeroes. Since the birth of computers, bits have been the language that controls the processes that take place inside that mysterious black box called your computer.

Screenshots of various sizes and rates.
By Paul Mullins: constructed/screenshot

Understanding what is bigger or faster, and by how much, will play a role in what you buy and how cost effective it is. The image shows various specifications that are commonly seen. What do they mean? Are the numbers good or bad? Which is better, 8 Mbps (also written as 8 Mbit/s) or 2 MBps?

Bit (Binary Digit)

A bit is simply a 1 or a 0. A true or a false. It is the most basic unit of data in a computer. It's like the dots and dashes in Morse code for a computer. Instructions written directly in bits are called machine language.

Any data of any kind that is stored in the computer or transmitted by the computer is ultimately made up of bits. A program (software) written in a high-level (human readable) language like Java or C++ is converted to machine language (bits) before the computer can run it.

A bit can represent anything we want, perhaps yes and no, but it has only two possible values. So, to represent more things, we have always grouped bits into larger chunks. The number of bits determines the maximum number of unique combinations of bits. A group of 8 bits has 256 (2^8) possible unique combinations. Each of those combinations can have its own meaning that we agree upon.
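The doubling rule can be checked with a few lines of Python. This is only an illustrative sketch: the function names count_patterns and list_patterns are made up for this example and are not part of the course material.

```python
from itertools import product


def count_patterns(n_bits):
    """Number of distinct patterns that n_bits can hold: 2 to the power n_bits."""
    return 2 ** n_bits


def list_patterns(n_bits):
    """Every pattern of n_bits, written as a string of 0s and 1s."""
    return ["".join(bits) for bits in product("01", repeat=n_bits)]


print(count_patterns(1))   # 2
print(count_patterns(5))   # 32
print(count_patterns(8))   # 256
print(list_patterns(2))    # ['00', '01', '10', '11']
```

Each extra bit doubles the number of patterns, which is why small increases in the number of bits give large increases in what can be represented.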

In the Morse code analogy, suppose we decided that every letter was a combination of five dots and dashes. That would provide us with 32 (2^5) unique code values. That is enough to represent the 26 letters of the alphabet and 6 more, perhaps punctuation. The Chinese might use the same 32 codes to represent something besides our alphabet, which is okay, as long as we know what language we are supposed to be reading. [By the way, Morse Code doesn't actually work that way.]

Byte

A byte is a basic unit of measurement of information storage or transmission that consists of 8 bits. It can be used to represent letters and numbers – up to 256 of each. For example, a Byte containing the 8 bits 01000101 represents

  • the letter E in the ASCII character set, or
  • the number 69, since 2^6 + 2^2 + 2^0 = 69.
There are many things the same pattern of bits could represent – as long as we all agree on the representation or rules for understanding it – like part of one pixel in an image. There are usually three parts to a pixel, one byte for red, another for green and the third for blue. All together that is 24 bits, so we can represent a total of 2^24 or 16 million (approximately) colors.
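As a small sketch of "same bits, different meanings", the Python below reads the pattern 01000101 first as a number and then as an ASCII character, and then packs three bytes into one pixel. The orange color values are arbitrary and chosen only for illustration.

```python
bits = "01000101"

value = int(bits, 2)   # read the pattern as a base-2 number
letter = chr(value)    # read the same value as an ASCII character code

print(value)                    # 69
print(letter)                   # E
print(2**6 + 2**2 + 2**0)       # 69

# One pixel: one byte each for red, green and blue (an arbitrary orange).
red, green, blue = 255, 165, 0
pixel = bytes([red, green, blue])

print(len(pixel) * 8)           # 24 bits per pixel
print(2 ** 24)                  # 16777216 possible colors
```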

Twitter allows messages of up to 140 characters (Bytes), while SMS (Short Message Service) or cell phone texting allows up to 160 Bytes. Most text-only emails can be measured in Bytes, i.e., the emails are relatively small.

Bit and Byte
By Paul Mullins: constructed image


Pixel: One byte each for the additive primary colors: Red, Green & Blue
By Paul Mullins: constructed image
Ignore the pedants. A byte has been exactly 8 bits for at least 30 years.

Word

The size of a computer "word" is variable. It is based upon how many bits the CPU can read at one time, and that has changed over time. We are currently (c. 2011) in a transition from 32-bit systems to 64-bit systems. Our only concern with this is that a 32-bit system can directly support (address) at most 4 GigaBytes of RAM (2^32 bytes). To use more RAM we need to "trick" the computer or get a 64-bit system. Oddly, most current computers also have right around 4 GB of RAM, so this knowledge is of immediate value in purchasing a new system. You want a 64-bit system!
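The 4 GB limit follows directly from the number of address bits. The sketch below assumes that each address refers to one byte of memory and uses the 1024-based gigabyte. The function name is hypothetical, and real 64-bit machines use fewer than 64 address bits in practice, so the last figure is a theoretical ceiling only.

```python
GB_BINARY = 1024 ** 3  # one gigabyte counted the 1024-based way


def addressable_bytes(address_bits):
    """Bytes reachable if every distinct address names exactly one byte (an assumption)."""
    return 2 ** address_bits


print(addressable_bytes(32))               # 4294967296
print(addressable_bytes(32) // GB_BINARY)  # 4
print(addressable_bytes(64) // GB_BINARY)  # 17179869184
```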

We got a bit ahead of ourselves there, so let's back up...

KB or KiloByte

One KB is 1000 Bytes, at least if you don't wear the title "geek" as a badge of honor and especially if you're in marketing. When measuring disk space, one KB is 1024 bytes (2^10). Obviously, if you were selling a disk drive that could hold exactly 1000 bytes, it would be to your advantage to market it as a 1 KB drive.

A typical word processor document, without a lot of images, is likely to measure in the KB range.

MB or MegaByte

A megabyte is approximately 1000 KB. Technically, it is 1024 KB (1024 x 1024 bytes). Again, the technical value is of interest primarily to geeks.

Most digital cameras create images that are in the MB range. An 8 MP (megapixel) camera has 8 million sensors, each representing a pixel or picture element. Each pixel, in turn, uses some number of bits to represent the various colors. A "true color" camera has 24 bits (3 bytes) per pixel. Assuming no compression, each image would be 24 MB! (In practice, we usually do compress the images and find them to be between two and eight MB.)
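The 24 MB figure is just pixels multiplied by bytes per pixel. A minimal sketch, assuming a megapixel is exactly one million pixels and a megabyte is one million bytes (the function name is made up for this example):

```python
def uncompressed_image_bytes(megapixels, bits_per_pixel=24):
    """Size in bytes of an image stored with no compression."""
    pixels = megapixels * 1_000_000
    return pixels * bits_per_pixel // 8  # 8 bits per byte


size = uncompressed_image_bytes(8)
print(size)                # 24000000
print(size / 1_000_000)    # 24.0 (megabytes)
```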

GB or GigaByte

A gigabyte is a unit of data storage worth approximately a billion bytes, meaning either 1000 MB or the more technical 1024 MB (1024 x 1024 x 1024 bytes). More often than not in advertising, gigabytes are presented as 1 billion bytes and not 1,073,741,824. (It's only off by about 7%.) This helps to explain why a freshly formatted 500 GB hard drive shows up as roughly a 465 GB drive instead. Not too long ago many people were discussing RAM and even disk storage in megabytes. These days, memory has become so cheap that having 4-16 gigabytes of RAM is considered the norm.
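The gap between the advertised and the reported capacity can be checked with a little arithmetic. The sketch below assumes the drive maker counts 1000-based gigabytes while the computer reports 1024-based ones. It ignores the space the file system itself uses, which lowers the reported figure a little further. The names are hypothetical.

```python
ADVERTISED_GB = 1000 ** 3  # bytes in a gigabyte as used in advertising
BINARY_GB = 1024 ** 3      # bytes in a gigabyte as counted in powers of 1024


def reported_size(advertised_gb):
    """Capacity the computer would show, in 1024-based gigabytes."""
    return advertised_gb * ADVERTISED_GB / BINARY_GB


print(round(reported_size(500), 1))                      # 465.7
print(round((BINARY_GB / ADVERTISED_GB - 1) * 100, 1))   # 7.4 (percent difference)
```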

A DVD holds gigabytes of data, enough for a single digital movie. So, keep in mind that storing movies on your hard drive will take large chunks of space.

TB or TeraByte

A terabyte is 1024^4 bytes, which is about one trillion bytes, or 1024 gigabytes. Data centers such as those operated by Google handle thousands if not millions of terabytes of data each day. As storage becomes cheaper and faster, terabytes are becoming a commonly heard term.

Disk drives in the TB range are now common. (You can buy them at Walmart!)

PB or PetaByte

A petabyte is a unit of information or computer storage equal to about one quadrillion bytes (1024^5).

Google processes (c. 2008) about 24PB of data per day. See map of Google data centers.
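The "marketing" (powers of 1000) and "technical" (powers of 1024) values drift further apart as the units grow. The sketch below tabulates both for the units covered so far. The helper name unit_table is made up for this example.

```python
UNITS = ["KB", "MB", "GB", "TB", "PB"]


def unit_table():
    """Rows of (unit, 1000-based bytes, 1024-based bytes, percent difference)."""
    rows = []
    for power, name in enumerate(UNITS, start=1):
        decimal = 1000 ** power
        binary = 1024 ** power
        gap_percent = (binary / decimal - 1) * 100
        rows.append((name, decimal, binary, round(gap_percent, 1)))
    return rows


for name, decimal, binary, gap in unit_table():
    print(f"{name}: {decimal:,} vs {binary:,} bytes ({gap}% apart)")
```

The difference is about 2% for a KB but grows to more than 12% for a PB, so the choice of definition matters more for large drives.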

tabular presentation of various sizes
By Paul Mullins: constructed image

Measurements of Data Speed

Today there are generally two ways of describing data transfer speeds: in bits per second, or in bytes per second. As explained above, a byte is made of 8 bits. Network engineers still describe network speeds in bits per second, while your Internet browser would usually measure a file download rate in bytes per second. A lower case "b" usually means a bit, while an upper case "B" represents a byte. Hence, the answer to "which is better, 8 Mbps or 2 MBps?" is 2 MBps, which is 16 Mbps. In the less common but clearer notation, 2 MBps is 16 Mbit/s, twice as fast as 8 Mbit/s. (Marketing people use this confusion to their advantage. If you're not sure which is intended, ask.)
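Converting between the two notations is a multiplication or division by 8. A minimal sketch, with function names invented for this example:

```python
BITS_PER_BYTE = 8


def megabytes_to_megabits(mbytes_per_second):
    """Convert a rate in MBps (bytes) to Mbps (bits)."""
    return mbytes_per_second * BITS_PER_BYTE


def megabits_to_megabytes(mbits_per_second):
    """Convert a rate in Mbps (bits) to MBps (bytes)."""
    return mbits_per_second / BITS_PER_BYTE


print(megabytes_to_megabits(2))   # 16
print(megabits_to_megabytes(8))   # 1.0
```

So 2 MBps is 16 Mbps, while 8 Mbps is only 1 MBps.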

bps

Known as bits per second, bps was the main way of describing data transfer speeds several decades ago. It was also loosely called the baud rate, so a 600 baud modem was one that could transfer data at around 600 bps. (Strictly, baud counts signal changes per second, which equals bps only when each change carries one bit.)

Kbps

Kilobits per second, or 1000 bits per second. (Network folks didn't get caught up in the 1000 vs 1024 problem.) Modern telephone modems operate at up to 56 Kbps.

Mbps

1,000,000 (million) bits per second. Often used in describing Internet download/upload speeds, as shown above.
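Combining a file size in bytes with a link speed in bits gives a rough download time. The sketch below assumes 1000-based megabytes and an ideal connection with no overhead or congestion, so real downloads will take longer. The function name is hypothetical.

```python
def download_seconds(file_megabytes, link_mbps):
    """Idealized time to transfer a file: size in megabits divided by rate in Mbps."""
    file_megabits = file_megabytes * 8  # 8 bits per byte
    return file_megabits / link_mbps


print(download_seconds(24, 8))       # 24.0 seconds for a 24 MB photo at 8 Mbps
print(download_seconds(24, 0.056))   # the same photo over a 56 Kbps modem
```

The same uncompressed photo that takes under half a minute at 8 Mbps would take close to an hour over a telephone modem.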

Gbps

1,000,000,000 (billion) bits per second. This term is most commonly heard in local area networks, where the close proximity of machines allows for fast data transfer rates.

Check your connection

How do you know if you're getting what you're paying your Internet service provider for? Try your own system at speedtest.net.

Partial map of the Internet based on the January 15, 2005 data found on opte.org. Each line is drawn between two nodes, representing two IP addresses. The length of each line is indicative of the delay between those two nodes. This graph represents less than 30% of the Class C networks reachable by the data collection program in early 2005.
By The Opte Project [CC-BY-2.5], via Wikimedia Commons

Acoustic coupler modem
Rama [CC-BY-SA-2.0-fr], via Wikimedia Commons



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Attribution: Dr. Paul Mullins, Slippery Rock University
These notes began life as the Wikiversity course Introduction to Computers.
The course draws extensively from and uses links to Wikipedia.
A large number of video links are provided to labrats.tv. (I hope you like cats. And food demos.)