Question

Question 9.1 Half-precision Floating-point Format (50 marks)

Do some research and find out how real (floating point) numbers are represented in binary.

(a) (10 = 6 + 4 marks) Devise your own 16-bit representation for floating point numbers. Draw a diagram of your representation and explain what the various bits are used for.

Explain in detail:

(i) How many bits are allocated to the mantissa and the exponent, respectively?

(ii) What defines the range and the precision (or accuracy) of the numbers stored in floating point notation?

Homework Answers

Answer #1

Half-precision floating point numbers use 16 bits (2 bytes) to store a number. Of the bits reserved for a floating point number, some store the fractional part, called the mantissa, and some store the position of the binary point, called the exponent. The leftmost bit, also called the most significant bit, is used as the sign bit: 0 for positive numbers and 1 for negative numbers.

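a) We can store a floating point number in 16 bits by using 10 bits for the mantissa, 5 bits for the exponent, and the most significant bit for the sign, as shown in the figure below.

[Figure: diagram of the 16-bit layout, with 1 sign bit, 5 exponent bits and 10 mantissa bits]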
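The bit allocation above can be sketched in code. This is only an illustration under stated assumptions: the answer gives the field widths but not their order or the exponent encoding, so the sketch borrows the IEEE 754 half-precision conventions (sign, then exponent, then mantissa; exponent bias of 15; implicit leading 1 for normal numbers; all-ones exponent reserved for infinity and NaN). The names `split_fields`, `decode` and `BIAS` are made up for this example.

```python
BIAS = 15  # assumed exponent bias, as in IEEE 754 half precision


def split_fields(bits):
    """Split a 16-bit pattern into (sign, exponent, mantissa) fields."""
    sign = (bits >> 15) & 0x1       # bit 15: sign
    exponent = (bits >> 10) & 0x1F  # bits 14..10: 5-bit exponent
    mantissa = bits & 0x3FF         # bits 9..0: 10-bit mantissa
    return sign, exponent, mantissa


def decode(bits):
    """Return the value of a 16-bit pattern under the assumed conventions."""
    sign, exponent, mantissa = split_fields(bits)
    if exponent == 0x1F:
        # All-ones exponent: infinity (mantissa 0) or NaN (assumed convention).
        value = float("inf") if mantissa == 0 else float("nan")
    elif exponent == 0:
        # Zero exponent: zero or a subnormal number, no implicit leading 1.
        value = (mantissa / 1024) * 2.0 ** (1 - BIAS)
    else:
        # Normal number: implicit leading 1 in front of the mantissa bits.
        value = (1 + mantissa / 1024) * 2.0 ** (exponent - BIAS)
    return -value if sign else value


print(split_fields(0xC000))  # (1, 16, 0)
print(decode(0x3C00))        # 1.0
print(decode(0xC000))        # -2.0
print(decode(0x7BFF))        # 65504.0
```

A design of your own could place the fields differently or use another bias; only the masks, the shifts and `BIAS` would change.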

i) 10 bits are allocated to the mantissa and 5 bits to the exponent. That means we can store number

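ii) In floating point notation, the number of exponent bits defines the range and the number of mantissa bits defines the precision. More exponent bits allow larger and smaller magnitudes to be stored, while more mantissa bits allow more significant digits.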
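To make the range and precision statement concrete, here is a rough sketch that computes the limits of a format from its field widths. It assumes IEEE 754-style conventions that the answer does not state: a bias of 2^(e-1) - 1, an implicit leading 1, and the all-ones exponent reserved for infinity and NaN. The function name `format_limits` and the dictionary keys are hypothetical.

```python
import math


def format_limits(exponent_bits, mantissa_bits):
    """Estimate range and precision for an IEEE 754-style format (assumed)."""
    bias = 2 ** (exponent_bits - 1) - 1
    # Largest usable exponent code is all-ones minus one (all-ones is reserved).
    max_exponent = (2 ** exponent_bits - 2) - bias
    min_exponent = 1 - bias  # smallest exponent of a normal number
    epsilon = 2.0 ** -mantissa_bits  # gap between 1.0 and the next value
    return {
        "largest_finite": (2 - epsilon) * 2.0 ** max_exponent,
        "smallest_normal": 2.0 ** min_exponent,
        "epsilon": epsilon,
        # The implicit leading 1 gives mantissa_bits + 1 significant bits.
        "decimal_digits": (mantissa_bits + 1) * math.log10(2),
    }


half = format_limits(exponent_bits=5, mantissa_bits=10)
print(half["largest_finite"])   # 65504.0
print(half["smallest_normal"])  # 6.103515625e-05
print(half["epsilon"])          # 0.0009765625
print(round(half["decimal_digits"], 2))
```

Changing `exponent_bits` moves only the largest and smallest magnitudes, while changing `mantissa_bits` moves the epsilon and the number of significant digits, which is the split between range and precision described above.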
