Description
Hi @terhorst @willright28, I'm facing the same issue, "RuntimeError: Distinguished lineages not found in data?", using the example data from this GitHub repository: https://github.com/popgenmethods/smcpp/blob/master/example/example.vcf.gz
These are the commands I ran:
smc++ vcf2smc example.vcf.gz chr1.smc.gz chr1 CEU:NA12878,NA12879
smc++ vcf2smc -d NA12878 NA12879 example.vcf.gz chr1.smc.gz chr1 CEU:NA12878,NA12879
for i in {7..9};
do smc++ vcf2smc -d NA1287$i NA1287$i example.vcf.gz out.$i.txt chr1 NA12877 NA12878 NA12890;
done
smc++ estimate -o output/ 0.1 out1.txt
Kindly help me solve this.
Please check the header of this file and the sample and population information, and suggest what changes I should make.
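For anyone checking this, the sample IDs a VCF actually contains can be read from its #CHROM header line. A minimal sketch, assuming only the standard gzip, grep, cut and tr tools are available; list_vcf_samples is a hypothetical helper name, not an smc++ command:

```shell
# Print the sample IDs of a gzipped/bgzipped VCF, one per line.
# list_vcf_samples is a hypothetical helper, not part of smc++.
list_vcf_samples() {
    # The #CHROM line holds 9 fixed columns; sample IDs start at column 10.
    gzip -dc "$1" | grep -m1 '^#CHROM' | cut -f10- | tr '\t' '\n'
}

# Example call (uncomment to run against the example file):
# list_vcf_samples example.vcf.gz
```

The printed IDs can then be compared with the sample IDs passed to smc++ vcf2smc.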
###########
mylinux@ChiragsPC:~/smcppdata$ smc++ vcf2smc example.vcf.gz chr1.smc.gz chr1 CEU:NA12878,NA12879
2016 smcpp.commands.vcf2smc WARNING Neither missing cutoff (-c) or mask (-m) has been specified. This means that stretches of the chromosome that do not have any VCF entries (for example, centromeres) will be interpreted as homozygous recessive.
2020 smcpp.commands.vcf2smc INFO Population 1:
2020 smcpp.commands.vcf2smc INFO Distinguished lineages: NA12878:0, NA12878:1
2021 smcpp.commands.vcf2smc INFO Undistinguished lineages: NA12879:0, NA12879:1
[E::idx_find_and_load] Could not retrieve index file for 'example.vcf.gz'
Traceback (most recent call last):
File "/home/mylinux/.local/bin/smc++", line 8, in <module>
sys.exit(main())
File "/home/mylinux/.local/lib/python3.10/site-packages/smcpp/frontend/console.py", line 28, in main
cmds[args.command].main(args)
File "/home/mylinux/.local/lib/python3.10/site-packages/smcpp/commands/vcf2smc.py", line 134, in main
raise RuntimeError("Distinguished lineages not found in data?")
RuntimeError: Distinguished lineages not found in data?
mylinux@ChiragsPC:~/smcppdata$ smc++ vcf2smc -d NA12878 NA12879 example.vcf.gz chr1.smc.gz chr1 CEU:NA12878,NA12879
2028 smcpp.commands.vcf2smc WARNING Neither missing cutoff (-c) or mask (-m) has been specified. This means that stretches of the chromosome that do not have any VCF entries (for example, centromeres) will be interpreted as homozygous recessive.
2029 smcpp.commands.vcf2smc INFO Population 1:
2029 smcpp.commands.vcf2smc INFO Distinguished lineages: NA12878:0, NA12879:1
2029 smcpp.commands.vcf2smc INFO Undistinguished lineages: NA12878:1, NA12879:0
[E::idx_find_and_load] Could not retrieve index file for 'example.vcf.gz'
Traceback (most recent call last):
File "/home/mylinux/.local/bin/smc++", line 8, in <module>
sys.exit(main())
File "/home/mylinux/.local/lib/python3.10/site-packages/smcpp/frontend/console.py", line 28, in main
cmds[args.command].main(args)
File "/home/mylinux/.local/lib/python3.10/site-packages/smcpp/commands/vcf2smc.py", line 134, in main
raise RuntimeError("Distinguished lineages not found in data?")
RuntimeError: Distinguished lineages not found in data?
mylinux@ChiragsPC:~/smcppdata$ for i in {7..9};
do smc++ vcf2smc -d NA1287$i NA1287$i example.vcf.gz out.$i.txt chr1 NA12877 NA12878 NA12890;
done
usage: smc++ vcf2smc [-h] [-v] [--cores CORES] [-d sample_id sample_id] [--length LENGTH] [--ignore-missing] [--missing-cutoff c] [--mask MASK] [--drop-first-last] vcf.gz out[.gz] contig pop1 [pop2]
smc++ vcf2smc: error: argument pop1: 'NA12877' should be a comma-separated list of sample ids preceded by a population identifier. See 'smc++ vcf2smc -h'.
usage: smc++ vcf2smc [-h] [-v] [--cores CORES] [-d sample_id sample_id] [--length LENGTH] [--ignore-missing] [--missing-cutoff c] [--mask MASK] [--drop-first-last] vcf.gz out[.gz] contig pop1 [pop2]
smc++ vcf2smc: error: argument pop1: 'NA12877' should be a comma-separated list of sample ids preceded by a population identifier. See 'smc++ vcf2smc -h'.
usage: smc++ vcf2smc [-h] [-v] [--cores CORES] [-d sample_id sample_id] [--length LENGTH] [--ignore-missing] [--missing-cutoff c] [--mask MASK] [--drop-first-last] vcf.gz out[.gz] contig pop1 [pop2]
smc++ vcf2smc: error: argument pop1: 'NA12877' should be a comma-separated list of sample ids preceded by a population identifier. See 'smc++ vcf2smc -h'.
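The usage error above says that pop1 must be a population identifier followed by a comma-separated list of sample IDs. A minimal sketch of building an argument in that shape; make_pop_arg is a hypothetical helper name and CEU is only a placeholder population label:

```shell
# Join sample IDs into the "POP:id1,id2,..." form named in the usage error.
# make_pop_arg is a hypothetical helper, not part of smc++.
# The body runs in a subshell so the IFS change does not leak to the caller.
make_pop_arg() (
    pop=$1
    shift
    IFS=,
    # "$*" joins the remaining arguments with the first character of IFS.
    printf '%s:%s\n' "$pop" "$*"
)

# Example call (uncomment to run):
# make_pop_arg CEU NA12877 NA12878 NA12890
```

The sample IDs passed in still have to exist in the VCF header for vcf2smc to find them.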
smc++ vcf2smc example.vcf.gz chr1.smc.gz chr1 CEU:NA1885,NA3861
827 smcpp.commands.vcf2smc WARNING Neither missing cutoff (-c) or mask (-m) has been specified. This means that stretches of the chromosome that do not have any VCF entries (for example, centromeres) will be interpreted as homozygous recessive.
827 smcpp.commands.vcf2smc INFO Population 1:
827 smcpp.commands.vcf2smc INFO Distinguished lineages: NA1885:0, NA1885:1
827 smcpp.commands.vcf2smc INFO Undistinguished lineages: NA3861:0, NA3861:1
Traceback (most recent call last):
File "/home/exouser/.local/bin/smc++", line 8, in <module>
sys.exit(main())
File "/home/exouser/.local/lib/python3.8/site-packages/smcpp/frontend/console.py", line 28, in main
cmds[args.command].main(args)
File "/home/exouser/.local/lib/python3.8/site-packages/smcpp/commands/vcf2smc.py", line 128, in main
vcf = VariantFile(args.vcf)
File "pysam/libcbcf.pyx", line 4117, in pysam.libcbcf.VariantFile.__init__
File "pysam/libcbcf.pyx", line 4347, in pysam.libcbcf.VariantFile.open
ValueError: invalid file b'example.vcf.gz' (mode=b'r') - is it VCF/BCF format?
@willright28, kindly send me the header information from your vcf.gz file and, if possible, an example data set from your original data, so that I can make the necessary changes accordingly.
@terhorst @willright28 I'm using the Ubuntu Linux application on Windows 10.
Regards,
Thank you