Question

I am reading in a CSV file (using R). When I first check if there are any NA's there are none. I then clean my data and convert my Income variable from num to factor by using this code to discretize income by equal-width bins:

min_income <- min(bd$income)
max_income <- max(bd$income)
bins = 3 
width=(max_income - min_income)/bins;
bd$income = cut(bd$income, breaks=seq(min_income, max_income, width))

When I complete cleaning/updating my data and check again for NA's I receive one. It is specific to row 65 for my income column. If I want to update the actual value in it, using the below code I receive an error.

> bd[65,5] = 5014.21
invalid factor level, NA generated

Is there a way to update this without having to change the type of variable? Why would it change the value to an NA (especially for only one value)? I have not come across this issue previously. I could just remove the row, but since I have the value I figured I should just use it.

Homework Answers

Answer #1

Check whether this particular value is stored as a string in your original CSV file, or was converted to a string on import. Its decimal point may be misrepresented (for example, as a comma). In that case, you can fix the value in the CSV or in the data frame before converting the column to a factor.

This will most likely solve your issue. If it does not, please let me know in the comments and share a few rows above and below the problematic row from your CSV, and I will try to replicate the issue and solve it.
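
One way to test this hypothesis is to inspect the column's type before binning. The sketch below is only an illustration: the vector raw_income and its values are made up and stand in for a column that was read as text because one entry uses a decimal comma.

```r
# Hypothetical values standing in for an income column read as text
# (the second entry uses a decimal comma instead of a point).
raw_income <- c("5014.21", "23000,50", "81234.00")

# "character" here would signal that the import did not yield numbers.
class(raw_income)

# Replace the decimal comma with a point, then convert to numeric.
income <- as.numeric(sub(",", ".", raw_income, fixed = TRUE))

is.numeric(income)   # TRUE
anyNA(income)        # FALSE: every entry converted cleanly
```

If the column had really been read as text, the subtraction max_income - min_income in the question's code would stop with an error, so a clean run of that code suggests the column is already numeric. For files that use decimal commas throughout, read.csv also accepts a dec argument.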

Solution 2:

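If row 65 holds the lowest income, the NA comes from cut() itself: by default its intervals are open on the left and closed on the right, so a value equal to the smallest break falls outside every bin. Use the include.lowest parameter to include it: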

bd$income = cut(bd$income,include.lowest = TRUE, breaks=seq(min_income, max_income, width))

After this change, the minimum value falls into the first bin and the NA no longer occurs.
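
The behaviour can be reproduced on a small scale. The sketch below uses made-up incomes (the vector income and its values are assumptions, not the asker's data) and follows the same binning steps as the question. It also shows why assigning a number to the factor column fails: a factor accepts only its existing level labels.

```r
# Made-up incomes; the first value plays the role of the minimum in row 65.
income <- c(5000, 20000, 45000, 95000)

bins <- 3
width <- (max(income) - min(income)) / bins
breaks <- seq(min(income), max(income), width)

# Default: intervals are (a, b], so the minimum lies outside the first bin.
default_bins <- cut(income, breaks = breaks)
na_flags <- is.na(default_bins)   # TRUE only for the minimum

# include.lowest = TRUE closes the first interval on the left: [a, b].
fixed_bins <- cut(income, breaks = breaks, include.lowest = TRUE)
anyNA(fixed_bins)                 # FALSE

# Assigning a raw number such as 5000 to a factor element triggers
# "invalid factor level, NA generated". Assigning a level label works:
default_bins[1] <- levels(default_bins)[1]
```

Repairing the NA by hand leaves a label that still reads as left-open, so re-running cut() with include.lowest = TRUE on the numeric column is the cleaner fix.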
