Data Analysis Report

    • Data Analysis Report

      Here is the first draft version of the balance report.
      drive.google.com/file/d/1CkiBA…2YoyiTlLlw_QMJWnsgfW/view

      For those who are interested, here you can find the raw data and the analysis sheet:
      drive.google.com/file/d/1wX0dZ…1NbPN-xun7ohj3_sxvk-/view


      More text and better layout will (hopefully) come.
      Feel free to give feedback.
    • You need to get ahead of the community. They are going to look at these plots and go "Army X is overperforming, why didn't they get nerfed?" or "Army Y is underperforming, where were our buffs?" You need something in the report that directly addresses that issue, so you can point to the analysis and say "no, see here, we checked that." The first DA report suffered from that issue, and that report straight up provided tests of the significance of deviations from the mean for each army. So I can guarantee it will appear again, and if you don't have it in the report, you will be on the back foot when those accusations start flying.
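
      As a rough illustration of what a per-army significance test could look like, here is a minimal sketch. It assumes a plain two-sided z-test of one army's win rate against 50%, using the normal approximation to the binomial. The function name and the win/game counts are made up for illustration; they are not taken from either DA report, and the reports may have used a different test.

```python
from math import sqrt
from statistics import NormalDist


def win_rate_z_test(wins, games, expected=0.5):
    """Two-sided z-test of an observed win rate against an expected one.

    Relies on the normal approximation to the binomial, so it is only
    reasonable when the number of games is fairly large.
    """
    observed = wins / games
    # Standard error under the null hypothesis (the army is balanced).
    std_err = sqrt(expected * (1 - expected) / games)
    z = (observed - expected) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return observed, z, p_value


# Hypothetical army: 110 wins in 200 games (illustrative numbers only).
rate, z, p = win_rate_z_test(110, 200)
print(f"win rate={rate:.3f}  z={z:.2f}  p-value={p:.3f}")
```

      A small p-value would suggest the deviation from 50% is unlikely to be chance alone; how draws are counted and whether games are weighted are left open here.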

      And this time I'm not spending 3 weeks explaining it to the community, that's your job now :P
      “You can never know everything, and part of what you know is always wrong. Perhaps even the most important part. A portion of wisdom lies in knowing that. A portion of courage lies in going on anyways.” -Lan Mandragoran, EotW

      UD Army Community Support

      Playtester

      Supreme Death Cult Hierarch

      Dovie’andi se tovya sagain.
    • Thank you for this important contribution to the 9th Age. Balanced gaming is essential for making the gaming experience sustainable and enjoyable for all enthusiasts.

      I have several concerns and comments about the "balance-report":
      - no conclusions are presented
      - even though hardly any graphs are self-explanatory, no graphs are explained.
      - I would recommend starting with the most interesting findings, which, as I understand it, appear on the last slide (12). My reading is that if the "certainty bars" cross 0.5, it is plausible that the race would have this many wins by chance alone, because the races are balanced. But(!) why 68% confidence intervals?? That is far too uncertain; it should rather be 95% confidence intervals, as usually used in science. Maybe this would show that all armies appear balanced instead of 6 appearing unbalanced.
      - I would be interested in seeing what is actually weighted for in the report, with an explanation. The last four graphs could be combined into a single figure with a point estimate and a confidence interval (crude and adjusted). Why even show slide 11?
      - I cannot figure out what the "ratio tournament placement section" figures are trying to show
      - number of contributions: is this the number of games in which each race occurs?
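
      To make the 68% versus 95% point concrete, here is a small sketch that computes both intervals for the same data and checks whether each one crosses 0.5. It assumes a simple normal-approximation (Wald) interval for a win rate. The helper names and the win/game counts are hypothetical, and the report's own intervals may be built differently (for example with weighting).

```python
from math import sqrt
from statistics import NormalDist


def win_rate_interval(wins, games, confidence):
    """Normal-approximation confidence interval for a win rate."""
    p = wins / games
    # Two-sided critical value, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sqrt(p * (1 - p) / games)
    return p - half_width, p + half_width


def crosses_even(interval, even=0.5):
    """True if the interval contains the 'balanced' win rate."""
    low, high = interval
    return low <= even <= high


# Hypothetical army: 110 wins in 200 games (illustrative numbers only).
wins, games = 110, 200
for confidence in (0.68, 0.95):
    low, high = win_rate_interval(wins, games, confidence)
    verdict = "crosses 0.5" if crosses_even((low, high)) else "misses 0.5"
    print(f"{confidence:.0%} interval: [{low:.3f}, {high:.3f}] -> {verdict}")
```

      With these made-up numbers the same army looks unbalanced at 68% and compatible with balance at 95%, which is exactly the ambiguity raised above.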

      Overall I think this is important work that would benefit if its authors made clear to themselves what question they are trying to answer! I think it is essential work for the gaming system, and future reports (likely revolving around 2.0) would benefit from a pre-prepared, structured statistical analysis plan.

      Thank you for your contribution to the game, I hope you can use my comments.

      The post was edited 1 time, last by Jakob Knudsen ().

    • Question:

      Why are you using placement based on soft scores for your analysis of the balance of the game? Doesn't that just contaminate your results with variables that have nothing to do with the balance of the game?
    • @Nicreap you got anything specific in mind in terms of deviations from the mean?
      The error bars on the final page not enough and/or the wrong thing?

      @arwaker Doesn't look like you added the horizontal line y=1/7 to the distribution plots?


      @Jakob Knudsen arwaker has said he hasn't got time for that at the moment. He has offered to give some notes to someone who has time to do it, if you wish to do so?
      re: the 68% intervals, I suspect the 95% intervals will show every army to be balanced; none of them are that far away at 68%.
      For science done properly, even 99.7% is dodgy, depending on how well look-elsewhere effects have been handled.
      What level of certainty we should require here is an interesting question, maybe we should ask the community.
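
      A quick sketch of why the look-elsewhere effect matters when many armies are plotted at once. It assumes every army is perfectly balanced and that the armies' results are independent, which is not strictly true since armies play each other. The army count of 16 and the function name are assumptions for illustration only.

```python
def false_alarm_summary(n_armies, confidence):
    """Chance-only 'unbalanced' flags if every army is truly balanced.

    Assumes independent armies, so each interval misses the true value
    with probability (1 - confidence).
    """
    miss_probability = 1 - confidence
    expected_flagged = n_armies * miss_probability
    p_at_least_one = 1 - confidence ** n_armies
    return expected_flagged, p_at_least_one


N_ARMIES = 16  # assumption: number of armies shown in the plot

for confidence in (0.68, 0.95, 0.997):
    expected, at_least_one = false_alarm_summary(N_ARMIES, confidence)
    print(
        f"{confidence:.1%} intervals: expect {expected:.2f} armies flagged "
        f"by chance, P(at least one) = {at_least_one:.3f}"
    )
```

      Under these assumptions, several armies would be expected to miss 0.5 at the 68% level even in a perfectly balanced game, so a handful of "unbalanced" armies at that level is not surprising on its own.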

      Balancing team

      EoS Community Support

      "Two things gamers hate most is change and the way things are" - Stygian
    • @DanT Yes, well that's perfectly fair, it is after all voluntary work, and I appreciate both that and the work. - which is why I'm spending my evening commenting my very first post on the 9th age forum. I'm giving my suggestions for improvements if or when time comes. And feedback asked for by arwaker.

      I realize that 95% isn't absolute certainty at all (which is lucky since it is how I make a living), but in my opinion 68% is nothing at all. Actually it's worse, because people who aren't really that interested in data analysis may wrongly conclude that their army is under- or overpowered, simply because too narrow a confidence interval has been chosen. This may create a demand to change something that's working as intended and (much worse) take away some joy from players misinterpreting the results.
    • Jakob Knudsen wrote:

      @DanT Yes, well that's perfectly fair, it is after all voluntary work, and I appreciate both that and the work. - which is why I'm spending my evening commenting my very first post on the 9th age forum. I'm giving my suggestions for improvements if or when time comes. And feedback asked for by arwaker.

      I realize that 95% isn't absolute certainty at all (which is lucky since it is how I make a living), but in my opinion 68% is nothing at all. Actually it's worse, because people who aren't really that interested in data analysis may wrongly conclude that their army is under- or overpowered, simply because too narrow a confidence interval has been chosen. This may create a demand to change something that's working as intended and (much worse) take away some joy from players misinterpreting the results.
      It was a serious suggestion. Your comment suggests you'd be a good person to turn his notes into commentary :)
      Get in touch with arwaker if you feel you could spare the time.

      I don't disagree about 95%, but I know what will happen:
      It will show all armies are balanced, and people will say that level of confidence is unnecessary.
      Hence, maybe we should ask the community what level of confidence they are interested in before they see the data.
      Always better to decide these things a priori rather than a posteriori, otherwise it just generates more problems.

    • DanT wrote:

      The error bars on the final page not enough and/or the wrong thing?
      The community is going to look at that and go

      "BH, DL, ID, OK, SA, and UD are all overperforming, why weren't they nerfed?" as well as
      "DE, DH, EoS, Ong, and VS are all underperforming, why didn't they get more buffs?"

      The reason? "The mean is below 50% and the error bars don't touch 50%, therefore they MUST be underperforming." It doesn't mean it's true, but considering I spent nearly a month explaining and re-explaining the first DA report to the public, I'm giving the warning now: those are the arguments backed by the "data" that WILL pop up all over the forum, particularly with 0 explanations in the report.

      I'm digging into the data provided to see if we can provide something more definitive, but school+research keeps me busy.


      DanT wrote:

      Hence, maybe we should ask the community what level of confidence they are interested in before they see the data.
      This is on the public forums, the cat's outta the bag :P
    • FWIW I think this release was premature. I 100% believe the data should be made public, but analysis without interpretation or context is asking for trouble. Totally get why you wouldn't have time to do it right now, but I agree with @Nicreap that this risks more confusion (or worse - misinformation!) than it solves right now. That said, I will look it over at the weekend and see what conclusions my weak stats skills might draw from it...

      Nicreap wrote:

      DanT wrote:

      The error bars on the final page not enough and/or the wrong thing?
      The community is going to look at that and go
      BH, DL, ID, OK, SA, and UD are all overperforming why weren't they nerfed? as well as
      DE, DH, EoS, Ong, and VS are all underperforming why didn't they get more buffs?
      I haven't read the report or even looked at the graph yet but that picture roughly matches my gut feel about those armies, broadly speaking. Rargh, gut feelz ;).

      Nicreap wrote:

      The reason? "The mean is below 50% and the error bars don't touch 50%, therefore they MUST be underperforming." It doesn't mean it's true, but considering I spent nearly a month explaining and re-explaining the first DA report to the public, I'm giving the warning now: those are the arguments backed by the "data" that WILL pop up all over the forum, particularly with 0 explanations in the report.
      I'm digging into the data provided to see if we can provide something more definitive, but school+research keeps me busy.
      I would tend to understand mean below 50% and error bars not touching it as performing less well than chance. The only way I think I might understand that army to not be underperforming is if its error bars, while not overlapping with 50%, did however also overlap with other armies. Then it would only be armies which had non-overlapping error bars which over/under perform relative to each other. Am I onto the right lines here?
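
      On comparing armies with each other: eyeballing whether two error bars overlap is only a rough guide, and a direct test on the difference between the two win rates is usually clearer. Below is a minimal sketch, assuming an unpooled two-proportion z-test and treating the two armies' results as independent samples (a simplification, since armies also play each other). All names and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def compare_win_rates(wins_a, games_a, wins_b, games_b):
    """Two-sided z-test for the difference between two win rates."""
    p_a = wins_a / games_a
    p_b = wins_b / games_b
    # Standard error of the difference, treating the samples as independent.
    std_err = sqrt(p_a * (1 - p_a) / games_a + p_b * (1 - p_b) / games_b)
    z = (p_a - p_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_a - p_b, z, p_value


# Hypothetical armies A and B (illustrative numbers only).
difference, z, p = compare_win_rates(110, 200, 98, 200)
print(f"difference={difference:+.3f}  z={z:.2f}  p-value={p:.3f}")
```

      This answers "is A doing better than B?" directly, which is a different question from "is A different from 50%?".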
      Join us on Ulthuan.net
    • ferny wrote:

      I would tend to understand mean below 50% and error bars not touching it as performing less well than chance. The only way I think I might understand that army to not be underperforming is if its error bars, while not overlapping with 50%, did however also overlap with other armies. Then it would only be armies which had non-overlapping error bars which over/under perform relative to each other. Am I onto the right lines here?

      You have to think about this like a survey vs a general election. We have a survey of people, 68% error bars mean that the result of the general election has a 68% chance to be within the top and the bottom bar.

      As it's been pointed out, 95% to 99.X% confidence is usually required before drawing any valuable conclusions. 95% confidence intervals would be way wider.
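
      The survey analogy can be checked with a small simulation. Strictly speaking, a 68% confidence interval is a statement about the procedure: if the survey were repeated many times, roughly 68% of the resulting intervals would contain the true value. The sketch below assumes a known true win rate and normal-approximation intervals; the function name, sample size and trial count are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage(true_rate, sample_size, confidence, trials, seed=0):
    """Fraction of simulated surveys whose interval contains the truth."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    hits = 0
    for _ in range(trials):
        # One "survey": sample_size games with a known true win rate.
        wins = sum(rng.random() < true_rate for _ in range(sample_size))
        p = wins / sample_size
        half_width = z * sqrt(p * (1 - p) / sample_size)
        if p - half_width <= true_rate <= p + half_width:
            hits += 1
    return hits / trials


for confidence in (0.68, 0.95):
    share = coverage(true_rate=0.5, sample_size=200, confidence=confidence, trials=2000)
    print(f"nominal {confidence:.0%}: intervals covered the truth in {share:.1%} of surveys")
```

      The simulated coverage should land near the nominal level, though not exactly on it, because win counts are discrete and the interval is an approximation.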
      The wind is rising!... We must try to live!
    • Neaj wrote:

      ferny wrote:

      I would tend to understand mean below 50% and error bars not touching it as performing less well than chance. The only way I think I might understand that army to not be underperforming is if its error bars, while not overlapping with 50%, did however also overlap with other armies. Then it would only be armies which had non-overlapping error bars which over/under perform relative to each other. Am I onto the right lines here?
      You have to think about this like a survey vs a general election. We have a survey of people, 68% error bars mean that the result of the general election has a 68% chance to be within the top and the bottom bar.

      As it's been pointed out, 95% to 99.X% confidence is usually required before drawing any valuable conclusions. 95% confidence intervals would be way wider.
      So the issue is with the confidence level itself rather than the interpretation? i.e. @Nicreap 's hypothetical conclusion about which armies are over and under performing is correct based on the data as presented, where correct means "68% likely to be correct"? i.e. the point of debate would be on whether 68% is high enough to have confidence in, rather than that the interpretation subject to that confidence is correct?
    • ferny wrote:

      i.e. @Nicreap 's hypothetical conclusion about which armies are over and under performing is correct based on the data as presented
      It's hard to say, because the data as presented isn't sufficient to really claim one way or the other. You could argue that there are trends that seem to be present in the data, but there hasn't been a quantitative analysis that can state with any certainty one way or the other.

      In short, there isn't really enough information to argue I am right, but you at the same time can't really claim my interpretation is wrong, because the data presented isn't detailed enough to make such distinctions. AKA, anyone can claim whatever they want and we can't really say they are wrong with the data at present.

      Because, yes, there appears to be a trend in the data showing under- and overperforming armies, which we apparently have not really quantified, so the conclusion that VS is underperforming can't really be disproved by the data report. The fact that it is a weak argument at present is pretty much irrelevant, because all the community will do is emphasize that there is nothing in the data to support the counterargument that VS is just fine. As mentioned before, I don't have the time (and I'm not on DA, so it's not my fire anyway :P ), so I'm not going to run around putting out those fires. I will leave that in the hands of the DA team, since they are better equipped to answer all those questions and refute those poorly supported claims :D :evil:

      The post was edited 1 time, last by Nicreap ().