Review updates #50

@KristinaRiemer

Required for review

  • Increase figures font size - 16fbb78
  • Enable users to get summaries for 1 or 2 metrics - 0b364e7
  • Add downloadable data template file - 6c319b6
  • Add to CPUE figure explanation? - screenshot needed to include figure caption
  • Make error bars stand out more - e2da2a4
  • Display duplicate data?
  • Add comment to instructions about most of data being repeated - attempt to condense user-provided data

For the future

  • Move to .fst files to improve app performance

Comments

I realize I crammed six plots onto each figure, so the font size is naturally small. However, I can see people wanting to group figures for reports in the future. Can we please increase the font size for all figures (axes, labels, legends) and printouts?
Reviewer 2 comments: Figures 3 and 5: The text in these figures is much too small, to the point of being almost unreadable even if you zoom in on the PDF (it gets pixelated before it is easily readable). I realize these are actual screenshots of the web site, but this is probably a sign that the web site figure fonts should be increased for the benefit of the web site, which in turn would help this publication.
Reviewer 1 comments: For the figures showing data (not screenshots), the font size is way too small to read. Figures 3 and 5 do not define the letters for size distribution categories.

See the reviewer comments below. Can we fix this?

Reviewer: I attempted to upload a dataset without weights, assuming I could make comparisons with CPUE and length distribution even if I didn’t have weight data (as will be the case for many surveys) and was given an error when I uploaded with a weight column with empty cells, a weight column with “NA”, and with no weight column. The only way I could get the summaries for CPUE and length distribution was to make up numeric weight data. This is problematic.

Similarly, the webtool seems unable to handle a survey with only catch and effort data where a person might want to compare CPUE, but has no individual fish length or weight information. This seems like a relatively common scenario but the tool will not allow missing columns or empty cells or NAs. How could I compare CPUE if I didn’t have length and weight?
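
One way to think about the fix for the two comments above: instead of rejecting an upload that lacks weight or length data, work out which metrics the uploaded columns can support and summarize only those. The sketch below illustrates that logic in Python purely for discussion (the app itself is R/Shiny). The column names, the metric names, and the `available_metrics` helper are all hypothetical, not the app's actual headers or functions.

```python
import csv
import io

# Hypothetical mapping from each metric to the columns it needs.
# These names are assumptions for illustration, not the app's real headers.
METRIC_COLUMNS = {
    "cpue": ["effort"],
    "length_distribution": ["total_length"],
    "weight_summary": ["total_length", "weight"],
}

# Cell values treated as "no data", matching what the reviewer tried.
MISSING = {"", "NA"}


def has_data(rows, column):
    """Return True if the column exists and at least one cell holds a value."""
    return any((row.get(column) or "").strip() not in MISSING for row in rows)


def available_metrics(csv_text):
    """List the metrics that can be summarized from the uploaded CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return sorted(
        metric
        for metric, columns in METRIC_COLUMNS.items()
        if all(has_data(rows, column) for column in columns)
    )
```

Under these assumptions, an upload with a missing weight column, an empty weight column, or a weight column of "NA" would all still yield CPUE and length summaries, and a catch-and-effort-only file would yield CPUE alone.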

Possible? Reviewer comments: It would also be incredibly helpful to have a data template that one could download with the required column headers already filled in.
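
A template of this kind is just a CSV containing only the header row. As a rough sketch of the idea (in Python for illustration; the app is R/Shiny, and the column names and the `template_csv` helper here are hypothetical placeholders rather than the real required headers):

```python
import csv
import io

# Hypothetical required headers; the app's actual column names may differ.
TEMPLATE_COLUMNS = ["waterbody", "species", "gear", "effort", "total_length", "weight"]


def template_csv(columns=TEMPLATE_COLUMNS):
    """Return the text of a CSV file that contains only the header row."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(columns)
    return buffer.getvalue()
```

Generating the template from the same column list the upload validation uses would keep the two from drifting apart.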

Can these explanations easily be added to the CPUE plots? Reviewer comments: What do the ends of the boxes and the span of the line on the boxplot represent (total data range, 95% CI, something else)? We may need to double check with Erin on this one.

See this comment. This doesn't require programming, just an answer. I believe we do have error bars in the length aggregation categories; the error is just so small that we can't see it. Correct? Reviewer comments: How did you handle the aggregation of length data across multiple surveys? No error bars or variability are shown.

Can this be done easily? Otherwise perhaps a more extensive future update? Reviewer comments: I noticed your shiny app can be a little slow (unresponsive), which is certainly understandable with almost 2 million fish records! I have some experience with shiny apps and R and wanted to pass along that I have found massive speed improvements by moving from .csv files to .fst files (produced with the fst package in R) and data.table-optimized functions in place of dplyr-like or base R statements for data manipulation. The compressed fst files load much faster and the indexing saved with fst files and multithreaded operations of data.table functions produce major processing speed improvements too. There is absolutely no need to respond to this comment to get your manuscript published, but I wanted to pass along what I had learned in case it helps you improve future versions of this valuable tool (and perhaps you are already using these speed-enhancing techniques...this is a huge database and some lag due to processing time is to be expected and is ultimately unavoidable).

Can the printout show how many duplicate data sets from the same waterbody are present? We would not print the name of the lake, just a waterbody number and the number of times it was sampled. In a table, something like:
Waterbody 1, 1
Waterbody 2, 4
Waterbody 3, 1
Waterbody 4, 2
…
Waterbody x, z
Reviewer comments: Even within the summaries, some critical information is lacking. For example, why are the coordinates of the survey location not available? Why can't we see the year that a survey took place? If a single water body has multiple years of data for the same species/gear, how is that handled? This seems critical. Additionally, users interested in making relevant comparisons for their purposes might want to filter based on lake/stream size or date of sampling (what separates fall vs spring, and what if we want to employ a more stringent filter?). I understand that implementing such filters in the app is likely not reasonable, but the tabular data available for download could include this information for users to filter themselves.
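
The waterbody table proposed above could be built roughly as follows. This is a minimal sketch in Python for illustration only (the app is R/Shiny); it assumes each record carries a waterbody name and some survey identifier, and both the record layout and the `survey_counts` helper are hypothetical.

```python
def survey_counts(records):
    """Count distinct surveys per waterbody, hiding the waterbody names.

    `records` is an iterable of (waterbody_name, survey_id) pairs, one per
    fish record. Waterbodies are numbered in order of first appearance.
    """
    surveys = {}
    for waterbody, survey_id in records:
        # A set keeps each survey once, however many fish it contributed.
        surveys.setdefault(waterbody, set()).add(survey_id)
    return [
        (f"Waterbody {number}", len(survey_ids))
        for number, survey_ids in enumerate(surveys.values(), start=1)
    ]
```

Numbering by first appearance is only one option; any stable ordering that does not reveal the lake name would serve the same purpose.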

How hard would it be to address this comment? Reviewer comment: It would be good to call out that the data frame should have repeated rows for most of the values when including length/weight data.
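
To make the repeated-rows layout concrete: survey-level values are the same on every row, and only the per-fish measurements change. The sketch below (Python for illustration; the app is R/Shiny, and the field names and `expand_rows` helper are hypothetical) builds that layout from one set of shared values plus a list of per-fish measurements.

```python
def expand_rows(shared, fish_rows):
    """Repeat the survey-level values on every per-fish row.

    `shared` holds values that are identical for the whole upload;
    `fish_rows` holds one dict of measurements per fish.
    """
    return [{**shared, **fish} for fish in fish_rows]
```

For example, calling `expand_rows({"waterbody": "Lake A", "gear": "gill net"}, [{"total_length": 310}, {"total_length": 295}])` would give two rows that both carry the same waterbody and gear values and differ only in length.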

For the User Input: right now, the CSV file requires many variables to be uploaded as repeated rows. Could we put the items that will be the same for every record in the data file into a single input box in the webtool? For example, maybe a box like this:
