FCS R Syntax Oct 2025 #222

Open
odai-saleh wants to merge 1 commit into main from R-syntax-for-the-new-syntaxes

@odai-saleh

No description provided.

@ValerioGiuffrida left a comment

@odai-saleh the changes advised below should yield shorter code that is better suited to the data processing -> indicator calculation step.

  FCS_flag_low = ifelse(!is.na(FCS) & FCS < 14, 1L, 0L),
  FCS_flag_high = ifelse(!is.na(FCS) & FCS > 100, 1L, 0L)
) %>%

These scripts are for processing data, not for flagging data quality issues.
Data quality control is a separate step.
@odai-saleh please remove the rows above.

Comment on lines +70 to +91
# ------------------------------------------------------------
# Data Quality Diagnostics
# ------------------------------------------------------------

# Descriptives for FCS
summary(df$FCS)
sd(df$FCS, na.rm = TRUE)

# Flag frequencies
table(df$FCS_flag_low, useNA = "ifany")
table(df$FCS_flag_high, useNA = "ifany")

# Category distribution
table(df$FCSCat28, useNA = "ifany")

# Sample check of food group distributions (min/max/mean)
food_group_stats <- df %>%
  summarise(across(all_of(fcs_vars),
                   list(min = ~min(.x, na.rm = TRUE),
                        max = ~max(.x, na.rm = TRUE),
                        mean = ~mean(.x, na.rm = TRUE))))
food_group_stats

@odai-saleh same rationale as above.
Data quality control is a separate step in the process, so it should live in its own script.
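
For illustration, a processing-only script might look roughly like the sketch below. It is not the code from this PR: the column names follow the WFP codebook convention and are assumed, the weights are the standard FCS weights, the 21/35 cut-offs are the usual FCSCat21 thresholds, and compute_fcs and the toy data are hypothetical.

```r
library(dplyr)

# Indicator calculation only: no flags, no descriptives, no frequency tables.
# Assumed columns: FCSStap, FCSPulse, FCSDairy, FCSPr, FCSVeg, FCSFruit,
# FCSFat, FCSSugar (days consumed in the last 7 days).
compute_fcs <- function(df) {
  df %>%
    mutate(
      # Weighted sum using the standard FCS weights (assumed here)
      FCS = FCSStap * 2 + FCSPulse * 3 + FCSDairy * 4 + FCSPr * 4 +
        FCSVeg + FCSFruit + FCSFat * 0.5 + FCSSugar * 0.5,
      # Usual 21/35 thresholds; a missing FCS stays NA
      FCSCat21 = case_when(
        FCS <= 21 ~ "Poor",
        FCS <= 35 ~ "Borderline",
        FCS > 35  ~ "Acceptable"
      )
    )
}

# Toy data, invented for illustration
toy <- data.frame(
  FCSStap = c(7, 3), FCSPulse = c(2, 0), FCSDairy = c(1, 0), FCSPr = c(3, 0),
  FCSVeg = c(5, 2), FCSFruit = c(1, 0), FCSFat = c(7, 4), FCSSugar = c(6, 2)
)
result <- compute_fcs(toy)
result[, c("FCS", "FCSCat21")]
```

Descriptives such as summary(), sd() and table() would then run against the output of this step from a separate data quality script.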

Comment on lines +41 to +47
# ----------------------------------------------------------
# 3) Clean impossible FCS values (should be 0-112)
# ----------------------------------------------------------
mutate(
  FCS = ifelse(FCS < 0 | FCS >= 113, NA_real_, FCS)
) %>%

Given the code in section 1, this is redundant: it checks for a mathematical impossibility.
@odai-saleh kindly remove it.

Comment on lines +41 to +47
# ----------------------------------------------------------
# 3) Clean any impossible FCS values (should be 0-112)
# ----------------------------------------------------------
mutate(
  FCS = ifelse(FCS < 0 | FCS >= 113, NA_real_, FCS)
) %>%

Given the logical cleaning done in step 1 of this code, the condition above is a mathematical impossibility.
@odai-saleh kindly remove these lines.
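
The bound can be checked by hand. A minimal sketch, assuming the standard FCS weights and assuming that step 1 (not shown in this thread) restricts every food group variable to 0-7 days; the names fcs_weights, fcs_min and fcs_max are illustrative only:

```r
# Assumed standard FCS weights for the eight food groups
fcs_weights <- c(staples = 2, pulses = 3, vegetables = 1, fruit = 1,
                 meat_fish = 4, dairy = 4, sugar = 0.5, oil = 0.5)

# Each group is a number of days in a 7-day recall, so valid inputs lie in 0..7
fcs_min <- sum(fcs_weights * 0)  # all groups at 0 days
fcs_max <- sum(fcs_weights * 7)  # all groups at 7 days

fcs_min  # 0
fcs_max  # 112
```

With inputs already limited to 0-7, FCS cannot fall below 0 or reach 113, so the range check never changes a value.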

Comment on lines +48 to +56
# ----------------------------------------------------------
# 4) Data quality flags
# low if FCS < 14; high if FCS > 100
# ----------------------------------------------------------
mutate(
  FCS_flag_low = ifelse(!is.na(FCS) & FCS < 14, 1L, 0L),
  FCS_flag_high = ifelse(!is.na(FCS) & FCS > 100, 1L, 0L)
) %>%

Data quality control is a separate step in the process.
@odai-saleh kindly remove the rows above.

Comment on lines +71 to +92
# ------------------------------------------------------------
# Data Quality Diagnostics
# ------------------------------------------------------------

# Descriptives for FCS
summary(df$FCS)
sd(df$FCS, na.rm = TRUE)

# Frequencies of flags
table(df$FCS_flag_low, useNA = "ifany")
table(df$FCS_flag_high, useNA = "ifany")

# Distribution of final categories
table(df$FCSCat21, useNA = "ifany")

# Sample check of food group distributions: min/max/mean
food_group_stats <- df %>%
  summarise(across(all_of(fcs_vars),
                   list(min = ~min(.x, na.rm = TRUE),
                        max = ~max(.x, na.rm = TRUE),
                        mean = ~mean(.x, na.rm = TRUE))))
food_group_stats

@odai-saleh see the comments above on data quality.
