Although the Cluster-Buster motif format page does not mention it:
    https://bu.wenglab.org/cluster-buster/help/cis-format.html
Cluster-Buster reads motif files with float values just as well as
files with integer values.
By default, integer values are written:
    motifs.write(motifs, "clusterbuster")
When a precision other than 0 is specified, float values are written:
    motifs.write(motifs, "clusterbuster", precision=2)
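The effect of the precision argument can be pictured with a small
standalone sketch. It does not call Biopython: format_counts_row is a
hypothetical helper, and the tab-separated row layout is an assumption
made only for illustration.

```python
def format_counts_row(counts, precision=0):
    """Format one matrix row (hypothetical helper, not Biopython's code)."""
    # precision=0 renders whole numbers; any other value renders
    # fixed-point floats with that many decimal places.
    return "\t".join(f"{value:.{precision}f}" for value in counts)


row = [3.0, 1.0, 12.25, 4.75]
print(format_counts_row(row))               # integer-style output
print(format_counts_row(row, precision=2))  # float output
```

Note that precision=0 rounds the counts, so fractional counts are only
preserved when a non-zero precision is requested.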
* Support degenerate consensus sequences for RNA motifs.
* Support parsing minimal MEME files with RNA motifs and motifs without all statistics provided in the letter-probability matrix line.
Support parsing minimal MEME RNA motif files.
Support parsing minimal MEME motif files where not all statistics
are provided in the "letter-probability matrix:" line:
- As seen in the minimal MEME motif file specification:
    http://web.mit.edu/meme/current/share/doc/meme-format.html#min_motif_pspm
  a "letter-probability matrix:" line is required, but all the "key= value"
  pairs after the "letter-probability matrix:" text are optional.
- The "alength= alphabet length" and "w= motif length" can be derived from
  the matrix if they are not specified, provided there is an empty line
  following the letter-probability matrix.
- The "nsites= source sites" will default to 20 if it is not provided and
  the "E= source E-value" will default to zero.
- It is relatively common to find minimal MEME motif files without an
  E-value, for example the HOCOMOCO v11 minimal MEME file:
    https://hocomoco11.autosome.org/final_bundle/hocomoco11/core/HUMAN/mono/HOCOMOCOv11_core_HUMAN_mono_meme_format.meme
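The defaults described above can be sketched as follows. This is a
minimal illustration, not the parser added in this commit:
parse_matrix_header is a hypothetical helper, and it assumes the matrix
rows have already been read up to the empty line that follows them.

```python
import re


def parse_matrix_header(line, rows):
    """Parse a "letter-probability matrix:" line (hypothetical helper).

    rows holds the already-read matrix rows, used to derive missing values.
    """
    prefix = "letter-probability matrix:"
    if not line.startswith(prefix):
        raise ValueError("not a letter-probability matrix line")
    # Collect whichever optional "key= value" pairs are present.
    pairs = dict(re.findall(r"(\w+)=\s*(\S+)", line[len(prefix):]))
    return {
        # Derived from the matrix when not given.
        "alength": int(pairs.get("alength", len(rows[0]))),
        "w": int(pairs.get("w", len(rows))),
        # Fixed defaults when not given.
        "nsites": int(pairs.get("nsites", 20)),
        "E": float(pairs.get("E", 0.0)),
    }


rows = [[0.25, 0.25, 0.25, 0.25], [0.1, 0.2, 0.3, 0.4]]
print(parse_matrix_header("letter-probability matrix:", rows))
print(parse_matrix_header(
    "letter-probability matrix: alength= 4 w= 2 nsites= 18 E= 1.2e-05", rows
))
```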
Fix assertAlmostEqual calls in tests that check very small numbers in
scientific notation by adding a "places" argument with enough precision.
This mostly affects comparisons between E-values stored in test files and
the values parsed from those files. Several tests were actually checking
for the wrong value, because the default of 7 decimal places treats any
two sufficiently small numbers as equal.
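A minimal illustration of the problem, using made-up E-values rather
than values from the actual test files:

```python
import unittest


class EvalueComparison(unittest.TestCase):
    def test_default_places_hides_mismatch(self):
        # The difference (about 3.4e-15) rounds to 0 at the default
        # 7 decimal places, so this wrong comparison passes.
        self.assertAlmostEqual(1.2e-20, 3.4e-15)

    def test_places_exposes_mismatch(self):
        # With enough places the same comparison fails, as it should.
        with self.assertRaises(AssertionError):
            self.assertAlmostEqual(1.2e-20, 3.4e-15, places=25)

    def test_places_accepts_match(self):
        self.assertAlmostEqual(1.2e-20, 1.2e-20, places=25)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(EvalueComparison)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```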
* Add support for reading PFM from Cys2His2 Zinc Finger Proteins PWM Predictor.
Add support for reading PFM from Cys2His2 Zinc Finger Proteins PWM Predictor
(http://zf.princeton.edu/logoMain.php).
* Capitalize motif subtypes for "pfm-four-columns" and "pfm-four-rows" correctly.
* Set empty motif name string when no motif name is found when reading PFM files in "pfm-four-columns" format.
* Add reading PFM in "pfm-four-columns" and "pfm-four-rows" to motifs tutorial.
* Implement XML parsing for MEME
Update parser to work with XML rather than plain-text MEME output.
This commit retains the functionality of the current parser.
TODOs are added where more information can be parsed from the XML.
* Fix instance start value
* Update MEME test case 3 with XML file
* Add function to convert strand (+/-)
* Fix strand for MEME test case 3
* Update MEME test case 3 with XML file
* Remove abandoned MEME test files
* Minor formatting fixes
* Update MEME test case 4 with XML file
* Remove MEME tests 1, 4, 4_11_4, RNA
TODO:
- add test for -rna option
- add tests for other MEME versions supporting XML output
* Update docstring examples
* Fix docstring styling
* Update NEWS.rst
* Update parse error
* Remove .html and .txt test files
* Remove abandoned MEME test files
* Rename function
* Update with sequence_id, sequence_name
Use dictionary to store map of sequence id : name
since name is not included under the instances tree
* Update copyright headers
* Update docstring example
* Update docstring example, add test file with 1 motif
* Update motifs documentation inline examples
* Restore meme.out
Used in Bio.motifs.minimal.read() docstring example
* Apply black style to meme.py
* replaced the format method with __format__
* trivial change to restart travis
* noqa to find out what makes travis sad
* travis is having a bad day
* doctest updates
* travis false negative
* fix doctest typo
* fix doctest typo
* now we're getting somewhere
* update tests to silence warning
* one more old format
Closes issue #1980
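The sequence id to name map and the strand conversion mentioned in the
MEME XML commits above can be sketched as follows. This is an
illustration only: the XML fragment is a simplified, hypothetical
stand-in for real MEME output, and convert_strand is a hypothetical
helper, not the function added to the parser.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment; real MEME XML output carries many
# more elements and attributes.
XML = """<MEME>
  <training_set>
    <sequence id="sequence_0" name="seqA"/>
    <sequence id="sequence_1" name="seqB"/>
  </training_set>
  <motifs>
    <motif id="motif_1">
      <contributing_sites>
        <contributing_site sequence_id="sequence_1" position="4" strand="minus"/>
        <contributing_site sequence_id="sequence_0" position="10" strand="plus"/>
      </contributing_sites>
    </motif>
  </motifs>
</MEME>"""


def convert_strand(strand):
    """Map an XML strand label to +/- (assumed labels: plus, minus)."""
    return "-" if strand == "minus" else "+"


root = ET.fromstring(XML)
# Sites only carry a sequence id, so build the id -> name map up front.
names = {seq.get("id"): seq.get("name") for seq in root.iter("sequence")}
sites = [
    (
        names[site.get("sequence_id")],
        int(site.get("position")),
        convert_strand(site.get("strand")),
    )
    for site in root.iter("contributing_site")
]
print(sites)
```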
Replace the old plain-text MAST output parser, which was written for:
    MAST version 3.0 (Release date: 2004/08/18 09:07:01)
The new parser expects MAST XML output instead.
Test cases were updated using http://meme-suite.org/meme-software/5.0.5/meme-5.0.5.tar.gz
Done with the following ad-hoc script:

"""In-place fix for D100 Missing docstring in public module."""
import os
import sys


def fix(lines, docstring):
    answer = []
    # 0 = preamble
    # 1 = waiting for it
    # 2 = seen module docstring
    state = 0
    for line in lines:
        if (state == 0 and line.startswith("#")) or not line.strip():
            # hashbang or licence block
            pass
        elif state == 0 and line.lstrip().startswith('"""'):
            state = 2
        elif state == 2:
            pass
        else:
            # Insert docstring
            answer.append('"""%s"""\n\n' % docstring)
            state = 2
        answer.append(line)
    return answer


for f in sys.argv[1:]:
    print(f)
    with open(f) as handle:
        old = list(handle)
    name = os.path.basename(f)
    assert name.endswith(".py")
    name = name[:-3].replace("_", " ")
    if name.startswith("test "):
        name = "Tests for %s" % name[5:]
    if not name.endswith(" tool"):
        name += " module"
    new = fix(old, name + ".")
    with open(f, "w") as handle:
        for line in new:
            handle.write(line)
print("Done")