powerpivot - Calculating an average of a DISTINCTCOUNT efficiently in DAX?


I'm trying to calculate some business logic in DAX that has turned out to be quite resource-heavy and complex. I have a large Power Pivot model (call it "Sales") with numerous dimensions and measures. Here's a simplified view of the Sales model:

+-------+--------+---------+------+---------+-------+
| state |  city  |  store  | week | product | sales |
+-------+--------+---------+------+---------+-------+
| ny    | nyc    | charlie |    1 |         | $5    |
| ma    | boston | bravo   |    2 | b       | $10   |
| -     | d.c.   | delta   |    1 |         | $20   |
+-------+--------+---------+------+---------+-------+

Essentially, I'm trying to calculate the DISTINCTCOUNT of product by store and week:

SUMMARIZE ( Sales, [Store], [Week], "Distinct Products", DISTINCTCOUNT ( [Product] ) )

+---------+------+-------------------+
|  store  | week | distinct products |
+---------+------+-------------------+
| charlie |    1 |                15 |
| charlie |    2 |                 7 |
| charlie |    3 |                12 |
| bravo   |    1 |                20 |
| bravo   |    2 |                14 |
| bravo   |    3 |                22 |
+---------+------+-------------------+

Then I want to calculate the average of these distinct products at the store level. The way I approached it was to take the previous calculation, run a SUMX on top of it, and divide by the number of distinct weeks:

SUMX (
    SUMMARIZE ( Sales, [Store], [Week], "Distinct Products", DISTINCTCOUNT ( [Product] ) ),
    [Distinct Products]
) / DISTINCTCOUNT ( [Week] )

+---------+------------------+
|  store  | average products |
+---------+------------------+
| charlie | 11.3             |
| bravo   | 18.7             |
+---------+------------------+
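For reference, the same logic can be written a little more compactly with AVERAGEX, which averages the per-store/per-week distinct counts directly. This is logically equivalent, not necessarily faster, since AVERAGEX is also an iterator; it's a sketch of the same calculation, not a fix:

```dax
-- Equivalent formulation: average the per-week distinct counts per store.
-- CALCULATE is needed so the distinct count is evaluated in each
-- store/week row context produced by SUMMARIZE.
Average Products :=
AVERAGEX (
    SUMMARIZE ( Sales, [Store], [Week] ),
    CALCULATE ( DISTINCTCOUNT ( Sales[Product] ) )
)
```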

I stored this calculation in a measure, and it worked fine when the dataset was smaller. But the dataset is now huge, and when I try to use the measure, it hangs until I have to cancel the process.

Is there a more efficient way to do this?

SUMX is appropriate in this case, since you want the distinct product count calculated independently for each store & each week, then summed by store, and divided by the number of weeks per store. There's no way around that. (If there was, I'd recommend it.)

However, SUMX is an iterator, and that is the likely cause of the slowdown. Since you can't eliminate SUMX entirely, the biggest factor here is the number of combinations of stores/weeks you have.

To confirm whether the number of store/week combinations is the source of the slowdown, try filtering out or removing 50% of the data in a copy of the model and see if that speeds things up. If it doesn't time out, keep adding data back in to get a sense of how many combinations is the failing point.
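To put a number on it, you can count the store/week combinations with a quick DAX query (run from a tool that accepts EVALUATE queries, such as DAX Studio; the table and column names below are the ones from the question):

```dax
-- Counts how many store/week combinations SUMX has to iterate over.
EVALUATE
ROW (
    "Store-Week Combinations",
    COUNTROWS ( SUMMARIZE ( Sales, [Store], [Week] ) )
)
```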

To make things faster with the full dataset:

  1. You may be able to filter to a subset of stores/weeks in the pivot table before dragging on the measure. This typically gives faster results than dragging on the measure first and then adding filters. (This isn't a change to the measure, more of a behaviour change for users of the model.)
  2. You might want to consider grouping at a higher level than week (e.g. month), to reduce the number of combinations the measure has to iterate over.
  3. If you're running 32-bit Excel, or only have 4 GB of RAM, consider 64-bit Excel and/or a more powerful machine. (I doubt this is the case, but I'm including it for comprehensiveness - Power Pivot can be a resource hog.)
  4. If you can move your model to Power BI Desktop (I don't believe calculated tables are supported in Power Pivot), you can extract the SUMMARIZE out into a calculated table and re-write your measure to reference that calculated table instead. This reduces the number of calculations the measure has to perform at run-time, since the combinations of store/week plus the distinct count of products are pre-calculated (leaving only the summing & division for the measure - a lot less work).
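As a sketch of suggestion 2, grouping by month instead of week cuts the number of iterations substantially. This assumes a hypothetical Sales[Month] column, which is not in the original model, so you'd need to add one (e.g. derived from a date column) first:

```dax
-- Hypothetical month-level variant of the original measure.
-- Sales[Month] is an assumed column, not part of the original model.
-- Fewer store/month combinations than store/week combinations
-- means fewer SUMX iterations.
SUMX (
    SUMMARIZE ( Sales, [Store], [Month], "Distinct Products", DISTINCTCOUNT ( [Product] ) ),
    [Distinct Products]
) / DISTINCTCOUNT ( [Month] )
```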


Calculated Table =
SUMMARIZE (
    Sales,
    [Store],
    [Week],
    "Distinct Products", DISTINCTCOUNT ( Sales[Product] )
)

Note: the calculated table code above is rudimentary and designed as a proof of concept. If this is the path you take, you'll want to make sure you have a separate store dimension to join the calculated table to, as it won't join to the source table directly.

Measure Using Calc Table =
SUMX (
    'Calculated Table',
    [Distinct Products] / DISTINCTCOUNT ( 'Calculated Table'[Week] )
)

Jason Thomas has a great post on calculated tables and when they can come in useful here: http://sqljason.com/2015/09/my-thoughts-on-calculated-tables-in.html.

If you can't use calculated tables, but your data is coming from a database of some form, you could do the same logic in SQL and import a pre-prepared separate table of unique store/weeks and their distinct counts.

I hope some of this proves useful (or that you've solved your problem another way).

