Fix moonlight z-score and updated datasets accordingly #169
Open
AlessiaCampo wants to merge 1 commit into
This PR fixes the Moonlight Z-score calculation within the `FEA()` function. The fix replaces the experimental `Exp.Log.Ratio` taken from the `DiseaseList` dataset with the experimental `logFC` taken from the input `DEGs` or `DAPs` matrix.

Since the FEA function now implements two different statistical methods to perform the analysis (namely the Fisher test ORA-based method and the `fgsea` method), the Moonlight Z-score calculation has been updated to make sense in both cases. With the `fgsea` method, the Moonlight Z-score is computed over all the genes in the `leadingEdge` list, which is the list of genes found to be enriched in a given BP and predicted to contribute most to the enrichment of that BP.

The difference between the two scores is the population of genes used to compute the z-score: the ORA-based method assumes that the input data matrix has already been filtered to contain only the significant DEGs, while the `fgsea` method considers all the input genes for the enrichment calculation. For this reason, with `fgsea` only the leading-edge genes (the most relevant ones) are taken into account for the score calculation.
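To make the difference in gene populations concrete, here is a minimal sketch in Python (not the package's R code). The scoring formula is an assumption: the PR description does not state it, so the sketch uses a simple sign-vote z-score in which each gene votes +1 or -1 depending on whether its `logFC` sign agrees with a literature effect on the BP. The names `vote_z_score`, `score_population`, `effect`, and all gene values are hypothetical and only illustrate which genes enter the score under each method.

```python
import math

def vote_z_score(genes, log_fc, effect):
    """Sign-vote z-score (ASSUMED formula, not confirmed by the PR).

    genes:  genes that enter the score
    log_fc: gene -> logFC taken from the input DEGs/DAPs matrix
    effect: gene -> +1 / -1, hypothetical literature effect on the BP
    """
    votes = []
    for g in genes:
        fc = log_fc.get(g, 0.0)
        if fc == 0.0 or g not in effect:
            continue  # no direction information for this gene
        votes.append(effect[g] * math.copysign(1, fc))
    return sum(votes) / math.sqrt(len(votes)) if votes else 0.0

def score_population(method, bp_genes, log_fc, leading_edge=()):
    """Pick the genes used for the score under each method."""
    if method == "ora":
        # Input is assumed pre-filtered to significant DEGs,
        # so every input gene annotated to the BP is used.
        return [g for g in bp_genes if g in log_fc]
    if method == "fgsea":
        # Input holds all genes, so only the leading-edge genes are used.
        return list(leading_edge)
    raise ValueError(f"unknown method: {method}")

# Hypothetical toy data
log_fc = {"A": 2.0, "B": -1.5, "C": 0.8, "D": -0.3}
effect = {"A": 1, "B": -1, "C": 1, "D": 1}
bp_genes = ["A", "B", "C", "D"]

z_ora = vote_z_score(score_population("ora", bp_genes, log_fc), log_fc, effect)
z_fgsea = vote_z_score(
    score_population("fgsea", bp_genes, log_fc, leading_edge=["A", "B", "C"]),
    log_fc, effect,
)
print(z_ora, z_fgsea)
```

In this toy case the ORA-style score uses all four genes, while the fgsea-style score drops `D` because it is not in the leading edge, so the two scores differ even though the `logFC` values are the same.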
Linked to the change in the Moonlight Z-score, the following datasets stored in `data/` were also updated:

- `dataFEA`, which contains the new scores
- `dataURA`, which uses FEA internally
- `dataPRA`, which uses `dataURA` as input

To let tests and examples run smoothly with the new Z-scores, `dataGRN` was also updated to include more TFs, so that the downstream analyses and functions using these data (i.e. URA, FEA, PRA, TFinfluence) work without issues.