Reduce rows in data frame when computing variable?

Question

in_cognito

May 13, 2022, 08:22

Would anyone have an idea as to how I can change this code so that I get to a data frame that only has one row per year?

# Step 1. Weighted sum = Sum(sum of mean severity ratings * the repetition of each word near trauma)

df_word <- df_word %>% # compute the product of mean sum ratings (AV,A,V) * lemma repetitions
  mutate(AV.prod=(AV.Mean.Sum*repet),
         A.prod=(A.Mean.Sum*repet),
         V.prod=(V.Mean.Sum.R*repet))

df_word <- df_word %>% # group by year and sum for each group (AV,A,V) (numerator)
  group_by(year) %>%
  mutate(sumAVprod.word = sum(AV.prod),
         sumAprod.word = sum(A.prod),
         sumVprod.word = sum(V.prod)) %>% ungroup()

# Step 2. Standardize: Weighted average = sum of (repetition-weighted severity: AV,A,V) by lemma / sum(repetitions by year)

df_word <- df_word %>% # sum repetitions by year (denominator)
  group_by(year) %>%
  mutate(sum_repet_word=sum(repet)) %>% ungroup()

df_word <- df_word %>% # compute standardization (for AV,A,V)
  mutate(sev_word=(sumAVprod.word/sum_repet_word),
         aro_word=(sumAprod.word/sum_repet_word),
         val_word=(sumVprod.word/sum_repet_word))

head(df_word) # verify values by manually calculating 1-3 rows of `sev_word`

[Image: current data frame (too many rows)]

[Image: desired data frame]
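
One possible way to get one row per year, offered as a sketch rather than a definitive fix: `mutate()` keeps every input row and repeats the group total on each of them, whereas `summarise()` collapses each group to a single row. The example below assumes the column names from the code above (`year`, `repet`, `AV.Mean.Sum`, `A.Mean.Sum`, `V.Mean.Sum.R`). The toy values and the name `df_year` are made up for illustration.

```r
library(dplyr)

# Toy data, invented for illustration; column names follow the question.
df_word <- data.frame(
  year         = c(2000, 2000, 2001, 2001),
  repet        = c(2, 3, 1, 4),
  AV.Mean.Sum  = c(1.0, 2.0, 3.0, 4.0),
  A.Mean.Sum   = c(0.5, 1.5, 2.5, 3.5),
  V.Mean.Sum.R = c(2.0, 4.0, 6.0, 8.0)
)

# summarise() returns one row per group, so numerator and denominator
# can be computed in a single step instead of several mutate() calls.
df_year <- df_word %>%
  group_by(year) %>%
  summarise(
    sev_word = sum(AV.Mean.Sum * repet) / sum(repet),   # weighted mean of AV
    aro_word = sum(A.Mean.Sum * repet) / sum(repet),    # weighted mean of A
    val_word = sum(V.Mean.Sum.R * repet) / sum(repet),  # weighted mean of V
    .groups = "drop"
  )

print(df_year)
```

If the per-row columns from the existing `mutate()` pipeline should be kept as they are, an alternative is to leave that code unchanged and finish with `distinct(year, sev_word, aro_word, val_word)`, which keeps one row for each unique combination of those columns. Base R's `weighted.mean(AV.Mean.Sum, repet)` should give the same value as the explicit `sum(...) / sum(...)` form.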

Topic r

Category Data Science
