6/05/17 Notes (Combining First Name and Last Name into one column)


Notes

  • Trying to use both “First Name” and “Last Name” as id words for the merge (instead of just “First Name”).
    • Tried defining “firstlast” as ‘First Name’ & ‘Last Name’ and then merging on ‘firstlast’. This did not work.
    • Tried using a comma instead of &. This did not work.
    • Tried using syntax I found online. This did not work.
      • left_on='First Name' and 'Last Name' right_on='First Name' and 'Last Name')
    • Tried deleting the spaces around “and”. This did not work.
    • Tried replacing “and” with a comma. This did not work.
    • Tried typing “AND” instead of “and”. This did not work.
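For reference, pandas takes several id words at once when they are passed as a list rather than joined with “and” or “&”. Below is a minimal sketch of that idea, not the code from these notes: the data frames `confidence` and `attitudes` and their values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the two survey data frames (made-up values).
confidence = pd.DataFrame({
    "First Name": ["Ana", "Ben", "Ana"],
    "Last Name": ["Lopez", "Kim", "Smith"],
    "Confidence": [4, 3, 5],
})
attitudes = pd.DataFrame({
    "First Name": ["Ana", "Ana", "Ben"],
    "Last Name": ["Smith", "Lopez", "Kim"],
    "Attitude": [2, 5, 4],
})

# A list of column names makes pandas match rows on both columns together,
# so the two students named "Ana" are kept apart by their last names.
merged_on_both = pd.merge(
    confidence,
    attitudes,
    on=["First Name", "Last Name"],
    how="inner",
)
print(merged_on_both)
```

If the key columns had different names in the two data frames, the same list form should work with `left_on=[...]` and `right_on=[...]`.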
  • I couldn’t find anything that worked to make two id words. Instead, I’m trying to merge the columns “First Name” and “Last Name” into a single “Name” column.
    • Tried using this syntax I found online. This did not work.
      • ConfidenceS14['Name'] = ConfidenceS14['First Name'].map(str) + ConfidenceS14['Last Name']
    • I also tried replacing the single quotes (‘ ’) with double quotes (“ ”). This did not work.
    • Tried “ConfidenceS14[‘Name’] = ConfidenceS14[‘First Name’, ‘Last Name’]”. This did not work.
    • Tried a different command I found online. This did not work.
      • df = pd.ConfidenceS14({'First Name': [[:]], 'Last Name': [[:]]})
        df['Name'] = df[['First Name', 'Last Name']].apply(lambda x: ''.join(x), axis=1)
    • Tried replacing the “:” in the command above with the students’ names. This did not work.
    • Tried mimicking another syntax online. This did not work.
      • df_merged = pd.merge(df1, df2, left_on=['name_indexcolumn_df1_here'], right_on=['name_indexcolumn_df2_here'], how='inner')
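One way to build a single “Name” column is plain string addition on the two columns. The sketch below assumes both columns hold text; the `survey` data frame and its values are made up, and the `.str.strip()` calls are an extra precaution against stray spaces rather than something taken from these notes. Selecting two columns at once needs double brackets (`df[['First Name', 'Last Name']]`), because `df['First Name', 'Last Name']` looks for one column with that pair as its label.

```python
import pandas as pd

# Hypothetical survey data with stray spaces around some names.
survey = pd.DataFrame({
    "First Name": ["Ana ", "Ben"],
    "Last Name": ["Lopez", " Kim"],
})

# Convert to text, trim whitespace, and join with a single space.
survey["Name"] = (
    survey["First Name"].astype(str).str.strip()
    + " "
    + survey["Last Name"].astype(str).str.strip()
)
print(survey["Name"].tolist())
```

Because the result is assigned to `survey["Name"]`, the new column is stored in the data frame alongside the existing columns.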
  • Tried renaming the columns “First Name” and “Last Name” to “Name” and then merging on “Name”. This did not work.
  • New plan: make a data frame that contains the first and last names, append it to both the Confidence survey data frame and the Attitudes survey data frame, and then merge the new Confidence data frame with the new Attitudes data frame by first and last name.
  • Made the new data frame using this command:
    • Names14 = {
          'Name': [insert all the names here]}
      df_b = pd.DataFrame(Names14, columns = ['Name'])
      df_b
    • This worked
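  • I tried to merge the name data frame with the Confidence survey using the command “pd.concat([Names14, ConfidenceS14], axis=1)”. This gave me the error “TypeError: cannot concatenate a non-NDFrame object”. I first took this to mean that the two data frames have different dimensions, so I tried adding the duplicate names back to the “Name” data frame. This did not work. (The error message itself says that one of the objects is not a pandas object: in the command above, Names14 is the plain dictionary, and df_b is the data frame built from it.)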
  • I tried merging the data frames by mimicking this command:
    • s1 = pd.Series(['X0', 'X1', 'X2', 'X3'], name='X')
      result = pd.concat([df1, s1], axis=1)
    • This merged the cells, but the names did not match up perfectly.
    • I added the phrase “NaN” between some of the names (this phrase stood in for duplicate people) and this worked!
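A likely reading of the TypeError above is that `pd.concat` only accepts pandas objects (Series or DataFrame), so passing the dictionary of names fails while passing the data frame built from it works. The sketch below uses made-up names (`names_dict`, `names_df`, `scores`) to show the difference. It also shows that `axis=1` places the objects side by side and lines rows up by index label, which may be why the names only matched after placeholders were added.

```python
import pandas as pd

# Hypothetical data: a plain dict of names and the DataFrame built from it.
names_dict = {"Name": ["Ana Lopez", "Ben Kim"]}
names_df = pd.DataFrame(names_dict)
scores = pd.DataFrame({"Confidence": [4, 3]})

# Passing the dict itself raises a TypeError, because it is not a pandas object.
try:
    pd.concat([names_dict, scores], axis=1)
    concat_failed = False
except TypeError:
    concat_failed = True

# Passing the DataFrame works; rows are paired by index label (0, 1).
side_by_side = pd.concat([names_df, scores], axis=1)
print(concat_failed)
print(side_by_side)
```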
  • I am going to try to merge the cells before I delete the duplicates, so I don’t have to worry about matching up the correct name to the correct column #
  • I tried merging “Confidence14” with “Attitudes14” (my names for the new data frames with full names). It did not work because the column holding the full names was named “0”, and this was a problem for some reason.
    • I tried renaming the columns the same way I did earlier in the code, but the renamed columns wouldn’t appear.
    • Instead of typing all the column names, I wrote “iloc[:,:]” and that worked!
    • Merging the two data frames worked! This process is a lot of work, though, and I feel like I should try to find a simpler way to do it. Torri suggested trying an if statement in the merge function.
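A possible explanation for the “0” column, offered as an assumption rather than a diagnosis: when an unnamed Series is concatenated with `axis=1`, pandas labels its column with the integer 0, not the text "0", so a rename has to use the integer as the key. The sketch below uses made-up data. Note that `rename` returns a new data frame, so the result needs to be assigned to a variable for the new name to show up.

```python
import pandas as pd

# Hypothetical data: an unnamed Series of full names and a score column.
names = pd.Series(["Ana Lopez", "Ben Kim"])
scores = pd.DataFrame({"Confidence": [4, 3]})

# The unnamed Series becomes a column labelled with the integer 0.
combined = pd.concat([names, scores], axis=1)
print(list(combined.columns))

# rename() returns a new DataFrame; the key is the integer 0, not the string "0".
renamed = combined.rename(columns={0: "Full Name"})
print(list(renamed.columns))
```

Giving the Series a name up front, as in `pd.Series([...], name="Full Name")`, avoids the rename step.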
  • I tried using this command I found online:
    • df['combined']= df['first name'] + ' ' + df['last name']
    • This displays a column with the first and last names, but I can’t get this column to appear alongside the other columns.
  • I tried merging anyway using this code:
    • ConAtt14 = pd.merge(left=Confidence14,right=Attitudes14, left_on='Full Name', right_on='Full Name')

      Merged_inner
      # what's the size of the output data?

      pd.set_option('display.max_columns', None)
      pd.set_option('display.max_rows', None)
      Merged_inner.shape
      ConAtt14

  • This worked! Now I have the confidence and attitude data merged by the full name of the student.
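The whole process can be sketched in a few lines: build a “Full Name” column in each data frame, then merge on it. This is a minimal sketch with made-up data, not the notebook code. The helper `add_full_name` and the lowercase names `confidence14`, `attitudes14`, and `con_att14` are hypothetical, and the outer merge with `indicator=True` is an added check for students who appear in only one survey.

```python
import pandas as pd

def add_full_name(df):
    """Return a copy of df with a 'Full Name' column (hypothetical helper)."""
    out = df.copy()
    out["Full Name"] = (
        out["First Name"].str.strip() + " " + out["Last Name"].str.strip()
    )
    return out

# Made-up survey data standing in for the real data frames.
confidence14 = add_full_name(pd.DataFrame({
    "First Name": ["Ana", "Ben"],
    "Last Name": ["Lopez", "Kim"],
    "Confidence": [4, 3],
}))
attitudes14 = add_full_name(pd.DataFrame({
    "First Name": ["Ben", "Cy"],
    "Last Name": ["Kim", "Diaz"],
    "Attitude": [5, 2],
}))

# An outer merge keeps every student; the _merge column records whether
# each row came from both surveys or only one of them.
con_att14 = pd.merge(
    confidence14[["Full Name", "Confidence"]],
    attitudes14[["Full Name", "Attitude"]],
    on="Full Name",
    how="outer",
    indicator=True,
)
unmatched = con_att14[con_att14["_merge"] != "both"]
print(con_att14.shape)
print(unmatched)
```

Students listed in `unmatched` are worth checking by hand, since a spelling difference between the two surveys would also keep a name from matching.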
